OTPose: Occlusion-Aware Transformer for Pose Estimation in Sparsely-Labeled Videos (SMC 2022 Oral)

This is an official repo for OTPose: Occlusion-Aware Transformer for Pose Estimation in Sparsely-Labeled Videos. [Paper]

Description

We propose OTPose (Occlusion-aware Transformer for Pose estimation), which explicitly encodes occlusion with a mask as a semi-supervised task and captures temporal dependencies with a transformer encoder. We effectively accumulate occlusion-specific features and introduce an attention mask that focuses on the overlapping regions of easily occluded keypoints. In addition, two temporal-domain branches independently encode pose features that focus on past and future frames, respectively.
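
For illustration only, the following minimal sketch (not the code in this repository; the tensor shapes and the name occlusion_bias are assumptions made for this example) shows how an additive attention bias over easily occluded keypoints could be passed to a standard PyTorch attention layer:

import torch
import torch.nn as nn

# Per-frame keypoint tokens: (batch, num_keypoints, feature_dim)
tokens = torch.randn(2, 17, 256)

# Suppose keypoints 3 and 4 overlap and are easily occluded (illustrative only)
occluded = torch.zeros(17, dtype=torch.bool)
occluded[3] = occluded[4] = True

# Additive bias that emphasizes attention toward the occluded keypoints
occlusion_bias = torch.zeros(17, 17)
occlusion_bias[:, occluded] = 1.0

attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
out, _ = attn(tokens, tokens, tokens, attn_mask=occlusion_bias)
print(out.shape)  # torch.Size([2, 17, 256])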

Getting Started

  1. Environment Requirement.
conda create -n OTPose python=3.6.12
conda activate OTPose
pip install -r requirements.txt
  2. Install DCN.
cd thirdparty/deform_conv
python setup.py develop

Data preparation

First, create a folder ${DATASET_DIR} to store the data of PoseTrack17 and PoseTrack18.

The directory structure should look like this:

${DATASET_DIR}
	|--${POSETRACK17_DIR}
	|--${POSETRACK18_DIR}

# For example, our directory structure is as follows.
# If you are not familiar with the configuration files (.yaml), please refer to our settings.
DataSet
	|--PoseTrack17
	|--PoseTrack18

For PoseTrack17 data, we use a slightly modified version of the PoseTrack dataset where we rename the frames to follow the %08d format, with the first frame indexed as 1 (i.e., 00000001.jpg). First, download the data from the PoseTrack download page. Then, rename the frames for each video as described above using this script.
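
If the linked script is unavailable, the renaming can be reproduced with a short snippet along the following lines (a hedged sketch; the folder path below is a placeholder for one video's frame directory):

import os

video_dir = "images_renamed/bonn/some_video"  # placeholder: one video's frame folder
for idx, name in enumerate(sorted(os.listdir(video_dir)), start=1):
    ext = os.path.splitext(name)[1]
    os.rename(os.path.join(video_dir, name),
              os.path.join(video_dir, "%08d%s" % (idx, ext)))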

Like PoseWarper and DCPose, we provide all the required JSON files, which have already been converted to COCO format. Evaluation is performed using the official PoseTrack evaluation code, poseval, which uses py-motmetrics internally. We also provide the MAT/JSON files required for evaluation.
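
As a quick sanity check, the provided COCO-format JSON files can be inspected with a few lines of Python (the file name below is a placeholder for whichever annotation file you use):

import json

with open("posetrack17_annotations_val.json") as f:  # placeholder file name
    coco = json.load(f)
print(list(coco.keys()))  # typically: images, annotations, categories
print(len(coco["images"]), "images,", len(coco["annotations"]), "annotations")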

Your extracted PoseTrack17 directory should look like this:

|--${POSETRACK17_DIR}
    |--images
        |-- bonn
        |-- bonn_5sec
        |-- bonn_mpii_test_5sec
        |-- bonn_mpii_test_v2_5sec
        |-- bonn_mpii_train_5sec
        |-- bonn_mpii_train_v2_5sec
        |-- mpii
        `-- mpii_5sec
    |--images_renamed   # first frame indexed as 1 (i.e. 00000001.jpg)
        |-- bonn
        |-- bonn_5sec
        |-- bonn_mpii_test_5sec
        |-- bonn_mpii_test_v2_5sec
        |-- bonn_mpii_train_5sec
        |-- bonn_mpii_train_v2_5sec
        |-- mpii
        `-- mpii_5sec

For PoseTrack18 data, please download the data from the PoseTrack download page. Since the video frames are named properly, you only need to extract them into a directory of your choice (no need to rename the video frames). As with PoseTrack17, we provide all the required JSON files for the PoseTrack18 dataset as well.

Your extracted PoseTrack18 images directory should look like this:

${POSETRACK18_DIR}
    |--images
        |-- test
        |-- train
        `-- val

Create symbolic links

ln -s  ${OTPose_SUPP_DIR}  ${OTPose_Project_Dir}  # For OTPose supplementary file
ln -s  ${DATASET_DIR}  ${OTPose_Project_Dir}      #  For Dataset


# For example
${OTPose_Project_Dir} = /your/project/path/Pose_Estimation_OTPose
${OTPose_SUPP_DIR}    = /your/supp/path/OTPose_supp_files
${DATASET_DIR}        = /your/dataset/path/DataSet

ln -s /your/supp/path/OTPose_supp_files  /your/project/path/Pose_Estimation_OTPose  # supplementary files symbolic link
ln -s /your/dataset/path/DataSet         /your/project/path/Pose_Estimation_OTPose  # dataset symbolic link

Training from scratch

For PoseTrack17

cd tools
# train
python train.py --cfg ../configs/posetimation/OTPose/posetrack17/model_RSN.yaml
# val
python eval.py --cfg ../configs/posetimation/OTPose/posetrack17/model_RSN.yaml

The results are saved to ${OTPose_Project_Dir}/output/PE/OTPose/OTPose/PoseTrack17/{Network_structure_hyperparameters} by default.

For PoseTrack18

cd tools
# train
python train.py --cfg ../configs/posetimation/OTPose/posetrack18/model_RSN.yaml
# val
python eval.py --cfg ../configs/posetimation/OTPose/posetrack18/model_RSN.yaml

The results are saved to ${OTPose_Project_Dir}/output/PE/OTPose/OTPose/PoseTrack18/{Network_structure_hyperparameters} by default.

Validating/Testing from our pretrained models

We will release our pretrained models as soon as the issues with our GPU server are resolved.

# Evaluate on the PoseTrack17 validation set
python run.py --cfg ../configs/posetimation/OTPose/posetrack17/model_RSN_trained.yaml --val
# Evaluate on the PoseTrack17 test set
python run.py --cfg ../configs/posetimation/OTPose/posetrack17/model_RSN_trained.yaml --test

Run on video

We will include all visualization code in this repo, but it still needs testing. We will provide the run command as soon as possible.

Citation

@inproceedings{jin2022otpose,
  title={OTPose: Occlusion-Aware Transformer for Pose Estimation in Sparsely-Labeled Videos},
  author={Jin, Kyung-Min and Lee, Gun-Hee and Lee, Seong-Whan},
  booktitle={2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC)},
  pages={3255--3260},
  year={2022},
  organization={IEEE}
}

Acknowledgements

  • The code is built upon DCPose.
