Commit

Initial commit.
Your Name committed Apr 29, 2021
1 parent 4da8841 commit 6b20297

Showing 34 changed files with 9,094 additions and 0 deletions.
6 changes: 6 additions & 0 deletions LICENSE.md
@@ -0,0 +1,6 @@
Copyright Snap Inc. 2021. This sample code is made available by Snap Inc. for informational purposes only. No license,
whether implied or otherwise, is granted in or to such code (including any rights to copy, modify, publish, distribute
and/or commercialize such code), unless you have entered into a separate agreement for such rights. Such code is
provided as-is, without warranty of any kind, express or implied, including any warranties of merchantability, title,
fitness for a particular purpose, non-infringement, or that such code is free of defects, errors or viruses. In no event
will Snap Inc. be liable for any damages or losses of any kind arising from the sample code or your use thereof.
98 changes: 98 additions & 0 deletions README.md
@@ -0,0 +1,98 @@
# Motion Representations for Articulated Animation

This repository contains the source code for the CVPR'2021 paper [Motion Representations for Articulated Animation](https://arxiv.org/abs/2104.11280) by [Aliaksandr Siarohin](https://aliaksandrsiarohin.github.io/aliaksandr-siarohin-website/), [Oliver Woodford](https://ojwoodford.github.io/), [Jian Ren](https://alanspike.github.io/), [Menglei Chai](https://mlchai.com/) and [Sergey Tulyakov](http://www.stulyakov.com/).

For more qualitative examples, visit our [project page](https://snap-research.github.io/articulated-animation/).

## Example animation

Here is an example of several images produced by our method. The first column shows the driving video; in each remaining column, the top image is animated using the motion extracted from the driving video.

![Screenshot](sup-mat/teaser.gif)

### Installation

We support ```python3```. To install the dependencies, run:
```bash
pip install -r requirements.txt
```
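
The training and demo commands below assume a working PyTorch installation with GPU support. The following sketch is not part of the repository; it is just a quick way to confirm that PyTorch can see a CUDA device before launching long runs.

```python
# Illustrative environment check (not part of the repository).
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Reports the first visible device, e.g. the one selected via CUDA_VISIBLE_DEVICES.
    print("Device:", torch.cuda.get_device_name(0))
```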

### YAML configs

There is one configuration file per dataset in the `config` folder, named ```config/dataset_name.yaml```. See ```config/dataset.yaml``` for a description of each parameter.

See the description of the parameters in ```config/vox256.yaml```. The configurations are adjusted to run on 1 V100 GPU; training on a 256x256 dataset takes approximately 2 days.
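
To get a quick overview of a config before editing it, the file can be loaded with PyYAML. This is an illustrative sketch, not part of the repository; only the ```animate_params``` and ```visualizer_params``` sections (which ```animate.py``` reads) are assumed to exist, and other key names may differ.

```python
# Illustrative config inspection (assumes PyYAML is installed).
import yaml

with open("config/vox256.yaml") as f:
    config = yaml.safe_load(f)

print(sorted(config.keys()))          # top-level sections of the config
print(config.get("animate_params"))   # section read by animate.py (num_pairs, mode, format, ...)
```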

### Pre-trained checkpoints
Checkpoints can be found in the ```checkpoints``` folder. Checkpoints are large, so we use [git lfs](https://git-lfs.github.com/) to store them. Either run ```git lfs pull``` or download the checkpoints manually from GitHub.
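
If you are unsure whether a checkpoint was fetched correctly (for example, whether you only have a git-lfs pointer file), a quick check is to load it with PyTorch and list its keys. The file name below is a placeholder for whichever checkpoint you downloaded, and the exact entries are an assumption based on how ```Logger.load_cpk``` is called in this repository.

```python
# Illustrative checkpoint inspection; fails on an un-pulled git-lfs pointer file.
import torch

cpk = torch.load("checkpoints/dataset_name.pth", map_location="cpu")  # placeholder path
# Expected (assumed) entries include state dicts for the generator,
# region predictor and avd_network.
print(list(cpk.keys()))
```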

### Animation Demo
To run a demo, download a checkpoint and run the following command:
```bash
python demo.py --config config/dataset_name.yaml --driving_video path/to/driving --source_image path/to/source --checkpoint path/to/checkpoint
```
The result will be stored in ```result.mp4```. For Animation via Disentanglement, add ```--mode avd```; for standard animation, add ```--mode standard``` instead.
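
As a quick sanity check of the demo output, the resulting video can be opened with ```imageio``` (already a dependency of this code). This is an illustrative sketch; it assumes the demo finished, wrote ```result.mp4``` to the working directory, and that imageio's ffmpeg support is available.

```python
# Illustrative check of the demo output.
import imageio

reader = imageio.get_reader("result.mp4")
fps = reader.get_meta_data().get("fps")
n_frames = sum(1 for _ in reader)   # iterate once to count frames
print(f"result.mp4: {n_frames} frames at {fps} fps")
```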

### Colab Demo
We prepared a demo that runs in Google Colab; see ```demo.ipynb```.


### Training

To train a model, run:
```bash
CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --device_ids 0
```
The code will create a folder in the log directory (each run creates a new time-stamped folder). Checkpoints will be saved to this folder.
To check the loss values during training, see ```log.txt```.
You can also check training-data reconstructions in the ```train-vis``` subfolder.
Then, to train **Animation via Disentanglement (AVD)**, use:

```bash
CUDA_VISIBLE_DEVICES=0 python run.py --checkpoint log/{folder}/cpk.pth --config config/dataset_name.yaml --device_ids 0 --mode train_avd
```
Here, ```{folder}``` is the name of the folder created in the previous step. (Note: escape spaces with a backslash '\'.)
This will use the same folder where the checkpoint was previously stored.
It will create a new checkpoint containing all the previous models plus the trained avd_network.
You can monitor performance in the log file and visualizations in the ```train-vis``` folder.

### Evaluation on video reconstruction

To evaluate the reconstruction performance, run:
```bash
CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode reconstruction --checkpoint log/{folder}/cpk.pth
```
Here, ```{folder}``` is the name of the folder created in the previous step. (Note: escape spaces with a backslash '\'.)
A ```reconstruction``` subfolder will be created in the checkpoint folder.
The generated videos will be stored in this folder; they are also saved to a ```png``` subfolder in lossless '.png' format for evaluation.
Instructions for computing metrics from the paper can be found [here](https://github.com/AliaksandrSiarohin/pose-evaluation).
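
For the metrics reported in the paper, use the linked pose-evaluation repository. If you only want a rough sanity check that reconstructions resemble the ground truth, a simple per-pixel L1 comparison over the exported '.png' frames can be computed as below; the folder paths are placeholders and this is not the paper's evaluation protocol.

```python
# Crude per-frame L1 comparison between two folders of matching .png frames.
# Not the paper's metric; paths are placeholders.
import os
import imageio
import numpy as np

def mean_l1(generated_dir, ground_truth_dir):
    errors = []
    for name in sorted(os.listdir(generated_dir)):
        gen = imageio.imread(os.path.join(generated_dir, name)).astype(np.float32) / 255.0
        gt = imageio.imread(os.path.join(ground_truth_dir, name)).astype(np.float32) / 255.0
        errors.append(np.abs(gen - gt).mean())
    return float(np.mean(errors))

print(mean_l1("path/to/reconstruction/png/video_frames", "path/to/ground_truth/video_frames"))
```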

### TED dataset
To obtain the TED dataset, run the following commands:
```bash
git clone https://github.com/AliaksandrSiarohin/video-preprocessing
cd video-preprocessing
python load_videos.py --metadata ../data/ted384-metadata.csv --format .mp4 --out_folder ../data/TED384-v2 --workers 8 --image_shape 384,384
```

### Training on your own dataset
1) Resize all the videos to the same size, e.g. 256x256. The videos can be in '.gif' or '.mp4' format, or a folder of images.
We recommend the latter: for each video, make a separate folder with all the frames in '.png' format. This format is lossless and has better I/O performance. (A conversion sketch is shown after this list.)

2) Create a folder ```data/dataset_name``` with 2 subfolders, ```train``` and ```test```. Put the training videos in ```train``` and the testing videos in ```test```.

3) Create a config file ```config/dataset_name.yaml```. See the description of the parameters in ```config/vox256.yaml```. Specify the dataset root in ```dataset_params``` by setting ```root_dir: data/dataset_name```. Adjust other parameters as desired, such as the number of epochs. Set ```id_sampling: False``` if you do not want to use id_sampling.
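
For step 1, a video can be converted into a per-frame '.png' folder with standard tools. The sketch below is illustrative, not part of the repository: the input/output paths are placeholders, it assumes ```imageio``` (with ffmpeg) and Pillow are available, and the zero-padded frame naming is an assumption; check how your dataset loader lists frames.

```python
# Illustrative conversion of one video into a folder of 256x256 .png frames.
import os
import imageio
import numpy as np
from PIL import Image

video_path = "raw_videos/clip0001.mp4"          # placeholder input video
out_dir = "data/dataset_name/train/clip0001"    # one folder per video
os.makedirs(out_dir, exist_ok=True)

for idx, frame in enumerate(imageio.get_reader(video_path)):
    frame = Image.fromarray(frame).resize((256, 256))
    imageio.imsave(os.path.join(out_dir, f"{idx:07d}.png"), np.asarray(frame))
```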


#### Additional notes

Citation:
```
@inproceedings{siarohin2021motion,
author={Siarohin, Aliaksandr and Woodford, Oliver and Ren, Jian and Chai, Menglei and Tulyakov, Sergey},
title={Motion Representations for Articulated Animation},
booktitle = {CVPR},
year = {2021}
}
```

104 changes: 104 additions & 0 deletions animate.py
@@ -0,0 +1,104 @@
"""
Copyright Snap Inc. 2021. This sample code is made available by Snap Inc. for informational purposes only.
No license, whether implied or otherwise, is granted in or to such code (including any rights to copy, modify,
publish, distribute and/or commercialize such code), unless you have entered into a separate agreement for such rights.
Such code is provided as-is, without warranty of any kind, express or implied, including any warranties of merchantability,
title, fitness for a particular purpose, non-infringement, or that such code is free of defects, errors or viruses.
In no event will Snap Inc. be liable for any damages or losses of any kind arising from the sample code or your use thereof.
"""

import os
from tqdm import tqdm

import torch
from torch.utils.data import DataLoader

from frames_dataset import PairedDataset
from logger import Logger, Visualizer
import imageio
from scipy.spatial import ConvexHull
import numpy as np

from sync_batchnorm import DataParallelWithCallback


def get_animation_region_params(source_region_params, driving_region_params, driving_region_params_initial,
                                mode='standard', avd_network=None, adapt_movement_scale=True):
    assert mode in ['standard', 'relative', 'avd']
    new_region_params = {k: v for k, v in driving_region_params.items()}
    if mode == 'standard':
        return new_region_params
    elif mode == 'relative':
        # Scale the driving motion by the ratio of the convex-hull areas of the
        # source regions and the initial driving regions.
        source_area = ConvexHull(source_region_params['shift'][0].data.cpu().numpy()).volume
        driving_area = ConvexHull(driving_region_params_initial['shift'][0].data.cpu().numpy()).volume
        movement_scale = np.sqrt(source_area) / np.sqrt(driving_area)

        shift_diff = (driving_region_params['shift'] - driving_region_params_initial['shift'])
        shift_diff *= movement_scale
        new_region_params['shift'] = shift_diff + source_region_params['shift']

        affine_diff = torch.matmul(driving_region_params['affine'],
                                   torch.inverse(driving_region_params_initial['affine']))
        new_region_params['affine'] = torch.matmul(affine_diff, source_region_params['affine'])
        return new_region_params
    elif mode == 'avd':
        new_region_params = avd_network(source_region_params, driving_region_params)
        return new_region_params


def animate(config, generator, region_predictor, avd_network, checkpoint, log_dir, dataset):
    animate_params = config['animate_params']
    log_dir = os.path.join(log_dir, 'animation')

    dataset = PairedDataset(initial_dataset=dataset, number_of_pairs=animate_params['num_pairs'])
    dataloader = DataLoader(dataset, batch_size=1, shuffle=False, num_workers=1)

    if checkpoint is not None:
        Logger.load_cpk(checkpoint, generator=generator, region_predictor=region_predictor,
                        avd_network=avd_network)
    else:
        raise AttributeError("Checkpoint should be specified for mode='animate'.")

    if not os.path.exists(log_dir):
        os.makedirs(log_dir)

    if torch.cuda.is_available():
        generator = DataParallelWithCallback(generator)
        region_predictor = DataParallelWithCallback(region_predictor)
        avd_network = DataParallelWithCallback(avd_network)

    generator.eval()
    region_predictor.eval()
    avd_network.eval()

    for it, x in tqdm(enumerate(dataloader)):
        with torch.no_grad():
            visualizations = []

            driving_video = x['driving_video']
            source_frame = x['source_video'][:, :, 0, :, :]

            # Region parameters of the source image and of the first driving frame
            # (the latter serves as the reference pose for relative/AVD modes).
            source_region_params = region_predictor(source_frame)
            driving_region_params_initial = region_predictor(driving_video[:, :, 0])

            for frame_idx in range(driving_video.shape[2]):
                driving_frame = driving_video[:, :, frame_idx]
                driving_region_params = region_predictor(driving_frame)
                new_region_params = get_animation_region_params(source_region_params, driving_region_params,
                                                                driving_region_params_initial,
                                                                mode=animate_params['mode'],
                                                                avd_network=avd_network)
                out = generator(source_frame, source_region_params=source_region_params,
                                driving_region_params=new_region_params)

                out['driving_region_params'] = driving_region_params
                out['source_region_params'] = source_region_params
                out['new_region_params'] = new_region_params

                visualization = Visualizer(**config['visualizer_params']).visualize(source=source_frame,
                                                                                    driving=driving_frame, out=out)
                visualizations.append(visualization)

            result_name = "-".join([x['driving_name'][0], x['source_name'][0]])
            image_name = result_name + animate_params['format']
            imageio.mimsave(os.path.join(log_dir, image_name), visualizations)