Kaidi Cao, Phitchaya Mangpo Phothilimthana, Sami Abu-El-Haija, Dustin Zelle, Yanqi Zhou, Charith Mendis, Jure Leskovec, Bryan Perozzi
This is the PyTorch implementation of GST+EFD from the paper Learning Large Graph Property Prediction via Graph Segment Training.
The codebase is built on GraphGPS. Install the environment by following its instructions.
- MalNet: the split info for MalNet-Large is provided in the splits folder.
- TpuGraphs.
We provide several training examples with this repo:
python main.py --cfg configs/malnetlarge-GST.yaml
For the TpuGraphs dataset, download the dataset following the instructions here. By default, put the train/valid/test splits under the folder ./datasets/TPUGraphs/raw/npz/layout/xla/random. To run on other collections, modify source and search in tpu_graphs.py.
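Before training, it can help to confirm that the splits are where the loader expects them. The following is a minimal sketch, not part of this repo: it assumes the folder pattern is layout/<source>/<search>/<split> and that the split folders are named train, valid and test. Only the xla/random default is stated above, and the helper names collection_dir and missing_splits are hypothetical.

```python
from pathlib import Path

# Assumed layout: <root>/<source>/<search>/<split>, with split folders named
# train, valid and test. Only the xla/random default is stated in the README.
LAYOUT_ROOT = Path("datasets/TPUGraphs/raw/npz/layout")
SPLITS = ("train", "valid", "test")


def collection_dir(source="xla", search="random", root=LAYOUT_ROOT):
    """Return the folder expected to hold the splits of one collection."""
    return Path(root) / source / search


def missing_splits(source="xla", search="random", root=LAYOUT_ROOT):
    """List the split folders that are absent for the given collection."""
    base = collection_dir(source, search, root)
    return [split for split in SPLITS if not (base / split).is_dir()]


if __name__ == "__main__":
    # Run from the repo root; reports which split folders still need data.
    missing = missing_splits()
    print("missing splits:", missing or "none")
```

An empty list from missing_splits means all three split folders exist for that collection; it does not check the contents of the folders.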
You can train by invoking:
python main_tpugraphs.py --cfg configs/tpugraphs.yaml
Please change device from cuda to cpu in the yaml file if you want to try CPU-only training.
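If you prefer to script this change rather than edit the file by hand, a small text rewrite is enough. This is a sketch under the assumption that device appears as a top-level key (a line starting with "device:") in the yaml file; the helper set_device is hypothetical and uses only the standard library, so it does not validate the rest of the config.

```python
import re


def set_device(yaml_text, device):
    """Replace the value of the first top-level `device:` key in YAML text.

    Assumes `device` is written at column 0; nested keys are left untouched.
    """
    new_text, count = re.subn(
        r"(?m)^device:[ \t]*\S+", f"device: {device}", yaml_text, count=1
    )
    if count == 0:
        raise KeyError("no top-level 'device' key found")
    return new_text


if __name__ == "__main__":
    sample = "out_dir: results\ndevice: cuda\ntrain:\n  batch_size: 4\n"
    print(set_device(sample, "cpu"))
```

Working on a copy of configs/tpugraphs.yaml keeps the original GPU configuration intact.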
To evaluate on the TpuGraphs dataset, run:
python test_tpugraphs.py --cfg configs/tpugraphs.yaml
If memory is not sufficient, change batch_size to 1 during evaluation. Set cfg.train.ckpt_best to True to save the best validation model during training for further evaluation.
To create your own custom model, you can supply a configuration (e.g., by copying configs/tpugraphs.yaml) and set the attribute type (inside of model) to some string that you register in network/custom_tpu_gnn.py.
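The idea of selecting a model by a registered string can be illustrated with a plain dictionary registry. This is a simplified stand-in, not the repo's actual registration code: MODEL_REGISTRY, register_model, build_model, MyTpuGNN and the name "my_tpu_gnn" are all hypothetical, and the config is modeled as a nested dict rather than the real config object.

```python
# Hypothetical stand-in for a name-to-class model registry.
MODEL_REGISTRY = {}


def register_model(name):
    """Class decorator that stores a model class under a string key."""
    def decorator(cls):
        if name in MODEL_REGISTRY:
            raise KeyError(f"model '{name}' is already registered")
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register_model("my_tpu_gnn")
class MyTpuGNN:
    """Placeholder model; a real one would define layers and a forward pass."""

    def __init__(self, dim_in, dim_out):
        self.dim_in = dim_in
        self.dim_out = dim_out


def build_model(cfg, dim_in, dim_out):
    """Instantiate the class whose key matches cfg['model']['type']."""
    return MODEL_REGISTRY[cfg["model"]["type"]](dim_in, dim_out)


if __name__ == "__main__":
    cfg = {"model": {"type": "my_tpu_gnn"}}  # mirrors model.type in the yaml
    model = build_model(cfg, dim_in=8, dim_out=1)
    print(type(model).__name__)
```

The string set as type in the configuration has to match the registered key exactly; an unknown key raises a KeyError in this sketch.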
If you find our paper and repo useful, please cite as
@article{cao2023learning,
title={Learning Large Graph Property Prediction via Graph Segment Training},
author={Cao, Kaidi and Phothilimthana, Phitchaya Mangpo and Abu-El-Haija, Sami and Zelle, Dustin and Zhou, Yanqi and Mendis, Charith and Leskovec, Jure and Perozzi, Bryan},
journal={arXiv preprint arXiv:2305.12322},
year={2023}
}