Code for the paper Learning from Massive Human Videos for Universal Humanoid Pose Control. Please refer to our project page for more demonstrations and up-to-date related resources.
To establish the environment, run this code in the shell:
conda create -n UH-1 python=3.8.11
conda activate UH-1
pip install git+https://github.com/openai/CLIP.git
pip install mujoco opencv-python
Download our text-to-keypoint model checkpoints from here.
git lfs install
git clone https://huggingface.co/USC-GVL/UH-1
For text-to-keypoint generation,
- Change the root_path in inference.py to the path of the checkpoints you just downloaded.
- Change the prompt_list in inference.py to the language prompts you want the model to generate (see the sketch below).
- Run the following command, and the generated humanoid motion will be stored in the output folder.
python inference.py
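As a minimal sketch, the two edits above might look like the following; the checkpoint path and prompts are placeholders for illustration, not values shipped with the repository.
# Inside inference.py -- illustrative values only, replace with your own
root_path = "/path/to/UH-1"                    # folder cloned from the Hugging Face repo
prompt_list = [
    "A person waves with both hands.",         # hypothetical prompt
    "A person walks forward and then stops.",  # hypothetical prompt
]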
The generated keypoints have shape [number of frames, 34-dim keypoint], where the 34-dim keypoint = 27-dim DoF joint pose values + 3-dim root position + 4-dim root orientation.
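As a minimal sketch of splitting these components apart, assuming the generated motion is saved as a NumPy array of shape [number of frames, 34] (the file name below is a placeholder):
import numpy as np

motion = np.load("output/example_motion.npy")  # placeholder file name, shape [num_frames, 34]
dof_pos = motion[:, :27]      # 27-dim DoF joint pose values
root_pos = motion[:, 27:30]   # 3-dim root position
root_quat = motion[:, 30:34]  # 4-dim root orientation (quaternion)
print(dof_pos.shape, root_pos.shape, root_quat.shape)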
To visualize these keypoints by directly setting the DoF poses,
- Change the file_list in visualize.py to the generated humanoid motion file names (see the sketch below).
- Run the following command, and the rendered video will be stored in the output folder.
mjpython visualize.py
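For example, the edit to visualize.py might look like this; the file names are placeholders for whatever inference.py wrote to the output folder.
# Inside visualize.py -- placeholder motion file names
file_list = [
    "wave_with_both_hands",
    "walk_forward_and_then_stop",
]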
If you want to perform closed-loop control conditioned on the generated humanoid keypoints, use the goal-conditioned humanoid control policy provided below.
To avoid dependency conflicts with Isaac Gym, we create a separate conda environment.
conda create -n UH-1-rl python=3.8
conda activate UH-1-rl
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1+cu117 -f https://download.pytorch.org/whl/torch_stable.html
pip install oauthlib==3.2.2 protobuf==5.28.1
# Download the Isaac Gym binaries from https://developer.nvidia.com/isaac-gym
cd isaacgym/python && pip install -e .
# then make sure you are at the root folder of this project
cd rsl_rl && pip install -e .
cd ../legged_gym && pip install -e .
pip install "torch==1.13.1" "numpy==1.23.0" pydelatin==0.2.8 wandb==0.17.5 tqdm opencv-python==4.10.0.84 ipdb pyfqmr==0.2.1 flask dill==0.3.8 gdown==5.2.0 pytorch_kinematics==0.7.4 easydict==1.13
Here is a sample of our training data. Due to GitHub's file size limit, the data file can be downloaded here.
Please put the data file at motion_lib/motion_pkl/motion_data_cmu_sample.pkl
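To sanity-check the download, the pickle can be loaded and inspected as below; the exact contents depend on the released file, so only the path comes from this README.
import pickle

with open("motion_lib/motion_pkl/motion_data_cmu_sample.pkl", "rb") as f:
    data = pickle.load(f)

print(type(data))
if isinstance(data, dict):
    print(list(data.keys())[:10])  # peek at the top-level keys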
To play the policy with the checkpoint we've provided, try
# make sure you are at the root folder of this project
cd legged_gym/legged_gym/scripts
python play.py 000-00 --task h1_2_mimic --device cuda:0
To train the goal-conditioned RL policy from scratch, try
# make sure you are at the root folder of this project
cd legged_gym/legged_gym/scripts
python train.py xxx-xx-run_name --task h1_2_mimic --device cuda:0
For the data collection pipeline, including Video Clip Extraction, 3D Human Pose Estimation, Video Captioning, and Motion Retargeting, please refer to this README.
If you find our work helpful, please cite us:
@article{mao2024learning,
title={Learning from Massive Human Videos for Universal Humanoid Pose Control},
author={Mao, Jiageng and Zhao, Siheng and Song, Siqi and Shi, Tianheng and Ye, Junjie and Zhang, Mingtong and Geng, Haoran and Malik, Jitendra and Guizilini, Vitor and Wang, Yue},
journal={arXiv preprint arXiv:2412.14172},
year={2024}
}