VPP: Efficient Conditional 3D Generation via Voxel-Point Progressive Representation, NeurIPS 2023
Zekun Qi, Muzhou Yu, Runpei Dong and Kaisheng Ma
- 🎆 Sep, 2023: VPP is accepted to NeurIPS 2023.
- 💥 Aug, 2023: Check out our previous works ACT and ReCon on 3D representation learning, which were accepted to ICLR and ICML 2023.
This repository contains the code release of VPP⚡: Efficient Conditional 3D Generation via Voxel-Point Progressive Representation.
PyTorch >= 1.7.0; python >= 3.7; CUDA >= 9.0; GCC >= 4.9; torchvision;
# Quick Start
conda create -n vpp python=3.8 -y
conda activate vpp
conda install pytorch==1.10.0 torchvision==0.11.0 cudatoolkit=11.3 -c pytorch -c nvidia
# pip install torch==1.10.0+cu113 torchvision==0.11.0+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
# Install basic required packages
pip install -r requirements.txt
# Chamfer Distance
cd ./extensions/chamfer_dist && python setup.py install --user
# Install SAP to reconstruct meshes from point clouds
cd sap
cd pointnet2_ops_lib && pip install -e .
wget https://github.com/facebookresearch/pytorch3d/archive/refs/tags/v0.6.1.zip
unzip v0.6.1.zip
cd pytorch3d-0.6.1/ && pip install -e .
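After the installation steps, it can help to confirm that the environment meets the minimums listed in the requirements line. The following is a minimal sketch, not part of the VPP codebase: `parse_version` and `check_environment` are hypothetical helper names, and the check covers only the Python, PyTorch, torchvision and CUDA visibility requirements (GCC and the CUDA toolkit version are not inspected).

```python
import importlib.util
import re
import sys

# Minimums copied from the requirements line of this README.
MIN_PYTHON = (3, 7)
MIN_TORCH = (1, 7, 0)


def parse_version(text):
    """Turn a string such as '1.10.0+cu113' into (1, 10, 0)."""
    core = text.split("+", 1)[0]  # drop local build tags like '+cu113'
    numbers = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_environment():
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("python >= 3.7 is required")
    if importlib.util.find_spec("torch") is None:
        problems.append("torch is not installed")
    else:
        import torch  # imported lazily so the script still runs without it

        if parse_version(str(torch.__version__)) < MIN_TORCH:
            problems.append("PyTorch >= 1.7.0 is required")
        if not torch.cuda.is_available():
            problems.append("CUDA is not visible to PyTorch")
    if importlib.util.find_spec("torchvision") is None:
        problems.append("torchvision is not installed")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks fine"]:
        print(line)
```

Running the script inside the `vpp` conda environment prints one line per problem, or a single confirmation line if nothing is missing.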
See DATASET.md for details.
sh scripts/train_vqgan.sh <gpu_id>
sh scripts/train_voxel_generator.sh <gpu_id>
sh scripts/train_grid_smoother.sh <gpu_id>
sh scripts/train_point_upsampler.sh <gpu_id>
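The four training scripts above can be launched one after another from a small driver. This is a sketch under assumptions: the README does not state whether the stages depend on each other, so the order below simply follows the listing, and `build_train_commands` and `run_all` are hypothetical helper names rather than part of the repository.

```python
import subprocess

# Script paths as listed in this README; the run order is assumed to match the listing.
TRAIN_SCRIPTS = [
    "scripts/train_vqgan.sh",
    "scripts/train_voxel_generator.sh",
    "scripts/train_grid_smoother.sh",
    "scripts/train_point_upsampler.sh",
]


def build_train_commands(gpu_id):
    """Return one argument list per training stage for the given GPU id."""
    return [["sh", script, str(gpu_id)] for script in TRAIN_SCRIPTS]


def run_all(gpu_id, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in build_train_commands(gpu_id):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError, so a failing stage stops the run.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(gpu_id=0, dry_run=True)
```

With `dry_run=True` the driver only prints the commands, which is a cheap way to review them before committing GPU time.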
text prompt:
sh scripts/inference/text_prompt.sh <gpu_id> "a round table."
image prompt:
sh scripts/inference/image_prompt.sh <gpu_id> <img_path>
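Text prompts usually contain spaces, so they must reach the script as a single argument, as the quoted example above does. The sketch below shows one way to build such command lines for several prompts; `text_command`, `image_command` and `as_shell_line` are hypothetical helpers, and the image path in the usage section is a placeholder, not a file shipped with the repository.

```python
import shlex

# Script paths as given in this README.
TEXT_SCRIPT = "scripts/inference/text_prompt.sh"
IMAGE_SCRIPT = "scripts/inference/image_prompt.sh"


def text_command(gpu_id, prompt):
    """Argument list for text-conditioned generation."""
    return ["sh", TEXT_SCRIPT, str(gpu_id), prompt]


def image_command(gpu_id, img_path):
    """Argument list for image-conditioned generation."""
    return ["sh", IMAGE_SCRIPT, str(gpu_id), img_path]


def as_shell_line(command):
    """Render an argument list as a copy-pasteable, safely quoted shell line."""
    return " ".join(shlex.quote(part) for part in command)


if __name__ == "__main__":
    for prompt in ["a round table.", "a chair with four legs."]:
        print(as_shell_line(text_command(0, prompt)))
    # Placeholder path, used only for illustration.
    print(as_shell_line(image_command(0, "path/to/image.png")))
```

Passing the argument lists directly to `subprocess.run` avoids shell quoting altogether; `as_shell_line` is only needed when the commands are written to a script or a log.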
Download the pretrained sap model.
cd sap
export CUDA_VISIBLE_DEVICES=0 && python mesh_reconstruction.py --config ../sap.json --ckpt ../sap.pkl --dataset_path ../points.npz --save_dir output/
The shape as points reconstruction pipeline originates from SLIDE, which has been trained on multi-category ShapeNet datasets with artificial noise.
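The mesh reconstruction command takes its input through `--dataset_path ../points.npz`. An `.npz` file is a zip archive of `.npy` arrays, so its contents can be listed before running the script. The helper below is a minimal sketch using only the standard library; `list_npz_arrays` is a hypothetical name, and which array names `mesh_reconstruction.py` expects is not stated in this README.

```python
import zipfile


def list_npz_arrays(path):
    """Return the names of the arrays stored in an .npz archive."""
    suffix = ".npy"
    with zipfile.ZipFile(path) as archive:
        return [
            name[: -len(suffix)]
            for name in archive.namelist()
            if name.endswith(suffix)
        ]


if __name__ == "__main__":
    import sys

    # Usage: python list_npz.py ../points.npz
    for array_name in list_npz_arrays(sys.argv[1]):
        print(array_name)
```

With NumPy installed, `numpy.load(path).files` returns the same list of names.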
We use the PointVisualizaiton repo to render beautiful point cloud images, including specified color rendering and attention distribution rendering.
If you have any questions related to the code or the paper, feel free to email Zekun ([email protected]).
VPP is released under MIT License. See the LICENSE file for more details.
This codebase is built upon Point-MAE, CLIP and SLIDE.
If you find our work useful in your research, please consider citing:
@inproceedings{vpp2023,
title={{VPP}: Efficient Conditional 3D Generation via Voxel-Point Progressive Representation},
author={Qi, Zekun and Yu, Muzhou and Dong, Runpei and Ma, Kaisheng},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=etd0ebzGOG}
}