introduction

toy-graph is a purely c CPU based multithreading graph computing system aiming to support billions vertexes. It's originally developed for computing pagerank for infomall web pages dataset. Thanks to Dr. Ma's guidance.

quick start

use pagerank as an example, for api usage, see pagerank example.

install bazel.

toy-graph is built using bazel, follow the installation guide to install it.

compile pagerank program.

bazel build //example/pagerank:main

prepare graph input file

raw input file is test/pagerank_test_graph, each line is an edge, format is src dst src_degree(pagerank task specific). we need to convert this file to binary first, we use 4 bytes to represent each variable(12 bytes each edge). run

bazel build //tool:convert_to_binary

to compile convert tool. In test directory, run

../bazel-bin/tool/convert_to_binary --input pagerank_test_graph
--output pagerank_test_graph.bin

then we get .bin binary graph input file.

run pagerank example

in toy-graph project directory, run

./bazel-bin/example/pagerank/main --input
test/pagerank_test_graph.bin --output test/pagerank_test_graph.pagerank
--vertexNum 5 --edgeNum 5 --iterations 1

finally we get test/pagerank_test_graph.pagerank result file, each line is vertex_index pagerank_value format.

system requirements(only linux tested)

Currently toy-graph supports single machine & vertex in memory & pull computation, one thread to read, multi threads to compute, memory cost is each_vertex_cost * num_of_vertex. There is no disk space needed except result file.

performance

we use valgrind to do memory check, such as memory leak, we use gtest to do unit test).
In order to reduce memory cost, all vertexes are stored on a continuous memory buffer instead of a vector of objects(aligned memory can increase memory cost). We leave each vertex memory management to user themselves.
to be continued(such as comparative experiments).

road map

speed optimization, unit tests and experiment results.
support push computation model.
add code comments, add docs.
support multi platforms.
support distributed computation.

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
core		core
example/pagerank		example/pagerank
test		test
third_party		third_party
tool		tool
.gitignore		.gitignore
README.md		README.md
WORKSPACE		WORKSPACE
run_example_pagerank.sh		run_example_pagerank.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

introduction

quick start

system requirements(only linux tested)

performance

road map

About

Releases

Packages

Languages

suiyuan2009/toy-graph

Folders and files

Latest commit

History

Repository files navigation

introduction

quick start

system requirements(only linux tested)

performance

road map

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages