This is the official code repository of our ICML paper "An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming" (2021).
You can follow the instructions here. We adopt an environment the same as our another previous project.
# Create conda environment
conda create --name ConfVAE python=3.7
# Activate the environment
conda activate ConfVAE
# Install packages
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
conda install rdkit==2020.03.3 -c rdkit
conda install tqdm networkx scipy scikit-learn h5py tensorboard -c conda-forge
pip install torchdiffeq==0.0.1
# Install PyTorch Geometric
conda install pytorch-geometric -c rusty1s -c conda-forge
The official datasets are available here.
The dataset file is a pickled Python list consisting of rdkit.Chem.rdchem.Mol
objects. Each conformation is stored individually as a Mol
object. For example, if a dataset contains 3 molecules, where the first molecule has 4 conformations, the second one and the third one have 5 and 6 conformations respectively, then the pickled Python list will contain 4 5 6 Mol
objects in total.
The output format is identical to the input format.
Example: training a model for QM9 molecules.
python train_vae.py \
--train_dataset ./data/qm9/train_QM9.pkl \
--val_dataset ./data/qm9/val_QM9.pkl
More training options can be found in train_vae.py
.
Example: generating conformations for each molecule in the QM9 test-split, with twice the number of test set for each molecule.
python eval_vae.py \
--ckpt ./logs/VAE_QM9 \
--dataset ./data/iclr/qm9/test_QM9.pkl \
--num_samples -2
More generation options can be found in eval_vae.py
.
Please consider citing our work if you find it helpful.
@inproceedings{
xu2021end,
title={An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming},
author={Xu, Minkai and Wang, Wujie and Luo, Shitong and Shi, Chence and Bengio, Yoshua and Gomez-Bombarelli, Rafael and Tang, Jian},
booktitle={International Conference on Machine Learning},
year={2021}
}
If you have any question, please contact me at [email protected] or [email protected].
Please also check our another concurrent work on molecular conformation generation, which has also been accepted in ICML'2021 (Long Talk): Learning Gradient Fields for Molecular Conformation Generation. [Code]