forked from snap-research/articulated-animation
Your Name committed on Apr 29, 2021 · commit 6b20297 (1 parent: 4da8841)
Showing 34 changed files with 9,094 additions and 0 deletions.
@@ -0,0 +1,6 @@
Copyright Snap Inc. 2021. This sample code is made available by Snap Inc. for informational purposes only. No license,
whether implied or otherwise, is granted in or to such code (including any rights to copy, modify, publish, distribute
and/or commercialize such code), unless you have entered into a separate agreement for such rights. Such code is
provided as-is, without warranty of any kind, express or implied, including any warranties of merchantability, title,
fitness for a particular purpose, non-infringement, or that such code is free of defects, errors or viruses. In no event
will Snap Inc. be liable for any damages or losses of any kind arising from the sample code or your use thereof.
@@ -0,0 +1,98 @@
# Motion Representations for Articulated Animation

This repository contains the source code for the CVPR 2021 paper [Motion Representations for Articulated Animation](https://arxiv.org/abs/2104.11280) by [Aliaksandr Siarohin](https://aliaksandrsiarohin.github.io/aliaksandr-siarohin-website/), [Oliver Woodford](https://ojwoodford.github.io/), [Jian Ren](https://alanspike.github.io/), [Menglei Chai](https://mlchai.com/) and [Sergey Tulyakov](http://www.stulyakov.com/).

For more qualitative examples, visit our [project page](https://snap-research.github.io/articulated-animation/).

## Example animation

Here is an example of several images produced by our method. The driving video is shown in the first column. For each remaining column, the top image is animated using the motions extracted from the driving video.

![Screenshot](sup-mat/teaser.gif)

### Installation

We support ```python3```. To install the dependencies, run:
```bash
pip install -r requirements.txt
```
### YAML configs

There are several configuration files, one for each `dataset`, in the `config` folder, named ```config/dataset_name.yaml```. See ```config/dataset.yaml``` for a description of each parameter.

See the description of the parameters in ```config/vox256.yaml```. We adjusted the configuration to run on 1 V100 GPU; training on a 256x256 dataset takes approximately 2 days.
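If you prefer to inspect or adjust a config programmatically rather than by hand, the following is a minimal sketch that loads one with PyYAML and prints its top-level sections. The section names shown in the comments (e.g. ```dataset_params```) are assumptions about the usual layout of these files; treat ```config/vox256.yaml``` as the authoritative reference.

```python
# Minimal sketch: load a training config and list its top-level sections.
# Assumes PyYAML is available (the training code reads these YAML configs,
# so it should already be in your environment).
import yaml

with open('config/vox256.yaml') as f:
    config = yaml.safe_load(f)

for section, params in config.items():
    print(section, '->', list(params) if isinstance(params, dict) else params)

# Hypothetical tweak for a quick smoke test; check the real key names in the config first:
# config['train_params']['num_epochs'] = 5
```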
### Pre-trained checkpoints
Checkpoints can be found in the ```checkpoints``` folder. Checkpoints are large, so we use [git lfs](https://git-lfs.github.com/) to store them. Either use ```git lfs pull``` or download the checkpoints manually from GitHub.

### Animation Demo
To run a demo, download a checkpoint and run the following command:
```bash
python demo.py --config config/dataset_name.yaml --driving_video path/to/driving --source_image path/to/source --checkpoint path/to/checkpoint
```
The result will be stored in ```result.mp4```. To use Animation via Disentanglement, add ```--mode avd```; for standard animation, add ```--mode standard``` instead.
### Colab Demo
We prepared a demo runnable in Google Colab; see ```demo.ipynb```.

### Training

To train a model, run:
```bash
CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --device_ids 0
```
The code will create a folder in the log directory (each run creates a new time-stamped folder), and checkpoints will be saved to this folder.
To check the loss values during training, see ```log.txt```.
You can also check training data reconstructions in the ```train-vis``` subfolder.
Then, to train **Animation via Disentanglement (AVD)**, use:

```bash
CUDA_VISIBLE_DEVICES=0 python run.py --checkpoint log/{folder}/cpk.pth --config config/dataset_name.yaml --device_ids 0 --mode train_avd
```
where ```{folder}``` is the name of the folder created in the previous step. (Note: escape spaces with a backslash '\'.)
This will reuse the folder where the checkpoint was previously stored.
It will create a new checkpoint containing all the previous models plus the trained avd_network.
You can monitor performance in the log file and visualizations in the ```train-vis``` folder.
### Evaluation on video reconstruction

To evaluate reconstruction performance, run:
```bash
CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode reconstruction --checkpoint log/{folder}/cpk.pth
```
where ```{folder}``` is the name of the folder created in the previous step. (Note: escape spaces with a backslash '\'.)
A ```reconstruction``` subfolder will be created in the checkpoint folder.
The generated videos will be stored in this folder; they will also be saved in a ```png``` subfolder in lossless '.png' format for evaluation.
Instructions for computing the metrics from the paper can be found [here](https://github.com/AliaksandrSiarohin/pose-evaluation).
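The metrics reported in the paper should be computed with the [pose-evaluation](https://github.com/AliaksandrSiarohin/pose-evaluation) repository linked above. As a quick sanity check only, the sketch below computes the mean per-pixel L1 error between generated and ground-truth frames stored as '.png'; the folder paths are placeholders to adapt to your layout.

```python
# Quick sanity check (not the paper's metrics): mean per-pixel L1 error between
# generated frames and ground-truth frames saved as lossless '.png' files.
# 'generated_dir' and 'gt_dir' are hypothetical paths containing same-named, same-sized frames.
import os
import numpy as np
import imageio

def mean_l1(generated_dir, gt_dir):
    errors = []
    for name in sorted(os.listdir(generated_dir)):
        if not name.endswith('.png'):
            continue
        gen = imageio.imread(os.path.join(generated_dir, name)).astype(np.float32) / 255.0
        gt = imageio.imread(os.path.join(gt_dir, name)).astype(np.float32) / 255.0
        errors.append(np.abs(gen - gt).mean())
    return float(np.mean(errors))

# print(mean_l1('log/{folder}/reconstruction/png/some_video', 'data/dataset_name/test/some_video'))
```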
### TED dataset
To obtain the TED dataset, run the following commands:
```bash
git clone https://github.com/AliaksandrSiarohin/video-preprocessing
cd video-preprocessing
python load_videos.py --metadata ../data/ted384-metadata.csv --format .mp4 --out_folder ../data/TED384-v2 --workers 8 --image_shape 384,384
```
### Training on your own dataset
1) Resize all the videos to the same size, e.g. 256x256. The videos can be '.gif' files, '.mp4' files, or folders of images.
We recommend the latter: for each video, make a separate folder with all the frames in '.png' format. This format is lossless and has better I/O performance (see the sketch after this list for one way to do the conversion).

2) Create a folder ```data/dataset_name``` with two subfolders, ```train``` and ```test```; put the training videos in ```train``` and the testing videos in ```test```.

3) Create a config file ```config/dataset_name.yaml```. See the description of the parameters in ```config/vox256.yaml```. Specify the dataset root in ```dataset_params``` by setting ```root_dir: data/dataset_name```. Adjust other parameters as desired, such as the number of epochs. Set ```id_sampling: False``` if you do not want to use id sampling.
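For step 1, here is a minimal, illustrative sketch of converting a single video into a folder of resized '.png' frames using ```imageio``` and ```scikit-image```. The frame-naming scheme and output paths are assumptions, not a requirement of the codebase; adapt them to your dataset layout.

```python
# Illustrative only: convert one video into a folder of 256x256 '.png' frames.
# Reading '.mp4' files with imageio requires the imageio-ffmpeg plugin.
import os
import imageio
from skimage.transform import resize
from skimage import img_as_ubyte

def video_to_png_folder(video_path, out_folder, frame_shape=(256, 256)):
    os.makedirs(out_folder, exist_ok=True)
    reader = imageio.get_reader(video_path)
    for idx, frame in enumerate(reader):
        frame = resize(frame, frame_shape)                     # float image in [0, 1]
        out_name = os.path.join(out_folder, '%07d.png' % idx)  # e.g. 0000000.png (assumed naming)
        imageio.imsave(out_name, img_as_ubyte(frame))
    reader.close()

# video_to_png_folder('my_video.mp4', 'data/dataset_name/train/my_video')
```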
#### Additional notes

Citation:
```
@inproceedings{siarohin2021motion,
  author    = {Siarohin, Aliaksandr and Woodford, Oliver and Ren, Jian and Chai, Menglei and Tulyakov, Sergey},
  title     = {Motion Representations for Articulated Animation},
  booktitle = {CVPR},
  year      = {2021}
}
```
@@ -0,0 +1,104 @@
""" | ||
Copyright Snap Inc. 2021. This sample code is made available by Snap Inc. for informational purposes only. | ||
No license, whether implied or otherwise, is granted in or to such code (including any rights to copy, modify, | ||
publish, distribute and/or commercialize such code), unless you have entered into a separate agreement for such rights. | ||
Such code is provided as-is, without warranty of any kind, express or implied, including any warranties of merchantability, | ||
title, fitness for a particular purpose, non-infringement, or that such code is free of defects, errors or viruses. | ||
In no event will Snap Inc. be liable for any damages or losses of any kind arising from the sample code or your use thereof. | ||
""" | ||
|
||
import os | ||
from tqdm import tqdm | ||
|
||
import torch | ||
from torch.utils.data import DataLoader | ||
|
||
from frames_dataset import PairedDataset | ||
from logger import Logger, Visualizer | ||
import imageio | ||
from scipy.spatial import ConvexHull | ||
import numpy as np | ||
|
||
from sync_batchnorm import DataParallelWithCallback | ||
|
||
|
||
def get_animation_region_params(source_region_params, driving_region_params, driving_region_params_initial, | ||
mode='standard', avd_network=None, adapt_movement_scale=True): | ||
assert mode in ['standard', 'relative', 'avd'] | ||
new_region_params = {k: v for k, v in driving_region_params.items()} | ||
if mode == 'standard': | ||
return new_region_params | ||
elif mode == 'relative': | ||
source_area = ConvexHull(source_region_params['shift'][0].data.cpu().numpy()).volume | ||
driving_area = ConvexHull(driving_region_params_initial['shift'][0].data.cpu().numpy()).volume | ||
movement_scale = np.sqrt(source_area) / np.sqrt(driving_area) | ||
|
||
shift_diff = (driving_region_params['shift'] - driving_region_params_initial['shift']) | ||
shift_diff *= movement_scale | ||
new_region_params['shift'] = shift_diff source_region_params['shift'] | ||
|
||
affine_diff = torch.matmul(driving_region_params['affine'], | ||
torch.inverse(driving_region_params_initial['affine'])) | ||
new_region_params['affine'] = torch.matmul(affine_diff, source_region_params['affine']) | ||
return new_region_params | ||
elif mode == 'avd': | ||
new_region_params = avd_network(source_region_params, driving_region_params) | ||
return new_region_params | ||
|
||
|
||
def animate(config, generator, region_predictor, avd_network, checkpoint, log_dir, dataset): | ||
animate_params = config['animate_params'] | ||
log_dir = os.path.join(log_dir, 'animation') | ||
|
||
dataset = PairedDataset(initial_dataset=dataset, number_of_pairs=animate_params['num_pairs']) | ||
dataloader = DataLoader(dataset, batch_size=1, shuffle=False, num_workers=1) | ||
|
||
if checkpoint is not None: | ||
Logger.load_cpk(checkpoint, generator=generator, region_predictor=region_predictor, | ||
avd_network=avd_network) | ||
else: | ||
raise AttributeError("Checkpoint should be specified for mode='animate'.") | ||
|
||
if not os.path.exists(log_dir): | ||
os.makedirs(log_dir) | ||
|
||
if torch.cuda.is_available(): | ||
generator = DataParallelWithCallback(generator) | ||
region_predictor = DataParallelWithCallback(region_predictor) | ||
avd_network = DataParallelWithCallback(avd_network) | ||
|
||
generator.eval() | ||
region_predictor.eval() | ||
avd_network.eval() | ||
|
||
for it, x in tqdm(enumerate(dataloader)): | ||
with torch.no_grad(): | ||
visualizations = [] | ||
|
||
driving_video = x['driving_video'] | ||
source_frame = x['source_video'][:, :, 0, :, :] | ||
|
||
source_region_params = region_predictor(source_frame) | ||
driving_region_params_initial = region_predictor(driving_video[:, :, 0]) | ||
|
||
for frame_idx in range(driving_video.shape[2]): | ||
driving_frame = driving_video[:, :, frame_idx] | ||
driving_region_params = region_predictor(driving_frame) | ||
new_region_params = get_animation_region_params(source_region_params, driving_region_params, | ||
driving_region_params_initial, | ||
mode=animate_params['mode'], | ||
avd_network=avd_network) | ||
out = generator(source_frame, source_region_params=source_region_params, | ||
driving_region_params=new_region_params) | ||
|
||
out['driving_region_params'] = driving_region_params | ||
out['source_region_params'] = source_region_params | ||
out['new_region_params'] = new_region_params | ||
|
||
visualization = Visualizer(**config['visualizer_params']).visualize(source=source_frame, | ||
driving=driving_frame, out=out) | ||
visualizations.append(visualization) | ||
|
||
result_name = "-".join([x['driving_name'][0], x['source_name'][0]]) | ||
image_name = result_name animate_params['format'] | ||
imageio.mimsave(os.path.join(log_dir, image_name), visualizations) |