AeonG is a temporal graph database that efficiently supports temporal features based on Memgraph. We provide a formally defined temporal property graph model, based on which we fundamentally design AeonG with a temporal-enhanced storage engine and query engine. AeonG utilizes a hybrid storage engine, in which we introduce a current store for maintaining current graphs and a historical store for storing historical graphs migrated from the current storage under MVCC management. Furthermore, AeonG equips a native temporal query engine to process temporal queries with efficient capability and data consistent guarantee.
- Fast querying capabilities over subgraphs at a past time point or range
- Small storage overhead of historical data
- Native support of transaction time
- ACID compliance
You can refer to the comprehensive documentation provided by Memgraph for building system dependencies. Additionally, we offer a Docker image to streamline this process. We highly recommend utilizing Docker for building AeonG.
docker pull hououou/aeong:v1
docker run -it -p 7687:7687 -p 7444:7444 --mount type=bind source=$pwd,target=/home/ --entrypoint bash aeong
Before compiling AeonG, you should activate the toolchain, which is used our own custom toolchain.
source /opt/toolchain-vXYZ/activate
Apart from the system-wide installed dependencies, AeonG needs some libraries to be built locally. The proper setup of these libraries should be checked by running the init script.
cd aeong
./init
With all of the dependencies installed and the build environment set-up, you need to configure the build system. To do that, execute the following:
mkdir -p build
cd build
cmake ..
If everything went OK, you can now, finally, run build AeonG binary and client binary.
make -j$(nproc) memgraph
cd tests/mgbench
make
After the compliation, you can run AeonG
./memgraph
We provide support for three temporal benchmarks to evaluate temporal performance. Additional details can be found in tests/benchmarks/README.md
-
We can automatically generate graph operation query statements for generating temporal data. To do that, execute the following:
cd tests/benchmarks/$workloadname python create_graph_op_quries.py --arg $arg_value
-
We can generate temporal query statements for evaluating temporal performance. To do that, execute the following:
cd tests/benchmarks/$workloadname python create_temporal_query.py --arg $arg_value
We provide tools for creating temporal databases and evaluating temporal database performance. These tools can be found in the srcipt directory.
We provide a tool that can report the average graph operation query latency and the storage consumption of the generated temporal database. To use it, execute the following:
cd tests/scripts/
python3 create_temporal_database.py
You can specify optional arguments to generate the desired temporal database. Check specific arguments for each workload by executing:
python create_temporal_database.py --help
Flag | Description |
---|---|
--aeong-binary | AeonG binary |
--client-binary | Client binary |
--num-workers | Number of workers |
--data-directory | Directory path where temporal database should be stored |
--original-dataset-cypher-path | Directory path indicating the original dataset cypher query statements |
--index-cypher-path | Index query path |
--graph-operation-cypher-path | Directory path indicating where the graph operation query statements should be stored |
--no-properties-on-edges | Disable properties on edges |
We provide a tool that can report the average temporal query latency.
cd tests/scripts/
python3 evaluate_temporal_q.py
The arguments are almost the same as for create_temporal_database.py, except for --temporal-query-cypher-path
indicating the temporal query path. You can specify optional arguments to generate the desired temporal database. Check specific arguments for each workload by executing:
python evaluate_temporal_q.py --help
To reproduce our paper's results, please follow the steps outlined below. First, download the mgBench, LDBC, and gMark datasets. More detailed information about these datasets is available in our benchmark directory. Additionally, our system and baseline systems can be obtained through the following Docker image. You need to first build our system and baseline systems, using the following scripts. Then the binary of baseline systems can be found in our Docker image at /home/clockg[memgraph-master/aeong]/build/memgraph.
docker pull hououou/aeong:v2
docker run it aeong:v2
cd /home/aeong[memgraph-master/clockg]
mkdir build
cmake ..
make -j$(nproc) memgraph
We provide scripts based on our benchmark generation tools and test tools to compare our system with baseline systems, Clock-G, and T-GQL. These scripts are available in the srcipt directory, and you can customize them based on your needs.
-
For Figure 8(a), 8(b), 9(a), 9(b), 9(c), and 9(d), where
$num_op
indicates the number of graph operations,$clockg_binary
indicates the binary path of Clock-G, and$memgraph_binary
indicates the binary path of T-GQL.cd tests/experiments ./t_mgBench_test.sh $num_op $clockg_binary $memgraph_binary
-
For Figure 8(c) and 9(e), use the following script to test the LDBC workload. Note that the LDBC workload is substantial and will take a considerable amount of time.
cd tests/experiments ./t_ldbc_test.sh $clockg_binary $memgraph_binary
-
For Figure 8(d) and 9(f), use the following script.
cd tests/experiments ./t_gmark_test.sh $clockg_binary $memgraph_binary
To assess the performance of AeonG, please follow the steps outlined below.
-
Figure 10. This experiment does not require creating a temporal database. Evaluate queries inherited from the original mgBench, LDBC, and gMark benchmarks. First, load the original datasets into each database. Then, test AeonG compared to Memgraph by specifying the database path
--data-directory
, executor binary path (either AeonG or Memgraph)--aeong-binary
, and evaluated query path--temporal-query-cypher-path
.cd tests/tools/ python3 evaluate_temporal_q.py --data-directory $database_path --aeong-binary $binary_path(aeong/memgraph) --temporal-query-cypher-path $query_cypher_path_value
-
Figure 11(a). Use the following command by specifying the database path
--data-directory
, the query path--temporal-query-cypher-path
, and GC cycle seconds--storage-gc-cycle-sec
.cd tests/tools/ python3 evaluate_temporal_q.py --data-directory $database_path --temporal-query-cypher-path $query_cypher_path_value --storage-gc-cycle-sec $gc_interval_value
-
Figure 11(b). Evaluate the effect of the anchor number. First, create a temporal database by varying
--anchor_num
. Then, evaluate temporal query latency.cd tests/tools/ python3 create_temporal_database.py --anchor_num $anchor_num_value --data-directory $data_directory python3 evaluate_temporal_q.py --data-directory $data_directory
-
Figure 11(c). Evaluate the retention period. Create a temporal database by varying the
--retention-period-sec
and then evaluate temporal query performance.cd tests/tools/ python3 create_temporal_database.py --retention-period-sec $retention_period --data-directory $data_directory python3 evaluate_temporal_q.py --data-directory $data_directory
-
Figure 11(d). Install TiKV first, then checkout the Aeon-G branch and rebuild the project. Vary temporal data by setting
--graph-operation-cypher-path
and set different cluster nodes of TiKV.git checkout Aeon-G cd build cmake .. make -j$(nproc) cd tests/tools/ python3 create_temporal_database.py --data-directory $data_directory --graph-operation-cypher-path $graph_op_path python3 evaluate_temporal_query.py --data-directory $data_directory
You can also test AeonG performance according to your needs. We guide you with following steps:
- Download dataset
- Generate graph operation query statements. You can use generation tools in our benchmarks directory (
/benchmarks/$workload_name/create_graph_op_queries.py
). - Create temporal database. You can use the tool in our script directory (
/tests/scripts/create_temporal_database.py
). It will report the graph operation query latency and storage consumption. - Generate temporal query statements. You can use generation tools in our benchmarks directory (
/benchmarks/$workload_name/create_temporal_query.py
). - Evaluate temporal performance. You can use the tool in our script directory (
/tests/scripts/evaluate_temporal_q.py
). It will report the temporal query latency.
AeonG is an extension of Memgraph. Details of our concept can be found in our paper. We also refer Memgraph's internal documentation to better understand our codes. We made following major changes to support temporal features.
- Storage Engine:
- Timestamps: Import timestamps into Vertex, Edge, and Delta strutures.
- Data Migration: Add data migration in the Storage::CollectGarbage() function to migrate unreclaimed data to RocksDB.
- Retain Historical Data in RocksDB: Utilize historical_delta.cpp to transfer deltas to key-value formats and store them to RocksDB.
- Query Engine:
- Add Temporal Syntax in Cypher.g4.
- Extend Scan Operator: In the ScanAllCursor.Pull() function, we introduce a function AddHistoricalVertices() to capture both unreclaimed and reclaimed historical versions.
- Extend scan operator: In the ExpandCursor.Pull() function, we introduce a function AddHistoricalEdges() to capture both unreclaimed and reclaimed historical versions.
We inherit the configuration settings from Memgraph, thus supporting all configurations described in Memgraph. For detailed information, please refer to this link. Additionally, AeonG supports three more configurations to provide temporal features.
Flag | Description |
---|---|
--bolt-port | Port on which the Bolt server should listen. |
--data-directory | Path to directory in which to save all permanent data. |
--data-recovery-on-startup | Facilitates recovery of one or more individual databases and their contents during startup. Replaces --storage-recover-on-startup |
--storage-gc-cycle-sec | Storage garbage collector interval (in seconds). |
--storage-recover-on-startup | Deprecated and replaced with the data_recovery_on_startup flag. Controls whether the storage recovers persisted data on startup. |
--storage-properties-on-edges | Controls whether edges have properties. |
--storage-snapshot-interval-sec | Storage snapshot creation interval (in seconds). Set to 0 to disable periodic snapshot creation. |
--storage-snapshot-retention-count | The number of snapshots that should always be kept. |
Flag | Description | Default |
---|---|---|
--retention-period-sec | Reclaim history period (in seconds). Set to 0 to disable reclaiming history from historical storage. | 0 |
--retention-cycle-sec | Reclaim history interval (in seconds). | 60 |
--anchor-num | Anchor number between two deltas. Set to 0 to use our multiple anchor strategies. | 0 |