Haoqi Fan*, Bo Xiong*, Karttikeya Mangalam*, Yanghao Li*, Zhicheng Yan, Jitendra Malik, Christoph Feichtenhofer*
In arXiv, 2104.11227, 2021. [[Paper](https://arxiv.org/abs/2104.11227)]
To use MViT-B models, please refer to the configs under `configs/Kinetics`, or see `MODEL_ZOO.md` for pre-trained models. See the paper for details. For example, the command
```
python tools/run_net.py \
  --cfg configs/Kinetics/MVIT-B.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_dataset \
```
should train and test an MViT-B model on your dataset.
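
To evaluate a pre-trained model from `MODEL_ZOO.md` without training, additional config overrides can be appended to the same command. The following is a minimal sketch, assuming the standard PySlowFast config keys `TRAIN.ENABLE` and `TEST.CHECKPOINT_FILE_PATH`; `path_to_your_checkpoint` is a placeholder for a downloaded checkpoint.

```
# Skip training and only run testing with a pre-trained checkpoint
# (assumes TRAIN.ENABLE / TEST.CHECKPOINT_FILE_PATH config keys).
python tools/run_net.py \
  --cfg configs/Kinetics/MVIT-B.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_dataset \
  TRAIN.ENABLE False \
  TEST.CHECKPOINT_FILE_PATH path_to_your_checkpoint
```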
If you find MViT useful for your research, please consider citing the paper using the following BibTeX entry:
```
@Article{mvit2021,
  author  = {Haoqi Fan and Bo Xiong and Karttikeya Mangalam and Yanghao Li and Zhicheng Yan and Jitendra Malik and Christoph Feichtenhofer},
  title   = {Multiscale Vision Transformers},
  journal = {arXiv preprint arXiv:2104.11227},
  year    = {2021},
}
```