Video-aided Unsupervised Grammar Induction, NAACL‘21 [best long paper]

Sy-Zhang/MMC-PCFG

Video-aided Unsupervised Grammar Induction

We investigate video-aided grammar induction, which learns a constituency parser from both unlabeled text and its corresponding video. We explore rich features (e.g. action, object, scene, audio, face, OCR and speech) from videos, taking the recent Compound PCFG as the baseline. We further propose a Multi-Modal Compound PCFG model (MMC-PCFG) to effectively aggregate these rich features from different modalities.

PDF

News

  • 🍻 Our follow-up work has been accepted to EMNLP 2022. [PDF] [Code]
  • 🏆 Our paper won the best long paper award at NAACL 2021.
  • ☀️ Our paper has been accepted to NAACL 2021.

Prerequisites

  • pytorch 1.5.0
  • python 3.7
  • easydict
  • terminaltables
  • tqdm
  • numpy
  • (A forked version) Torch-Struct
  • (Optional) benepar
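
As a quick sanity check before training, the sketch below reports which of the listed packages can be imported by the active Python interpreter. The import names are assumptions (pytorch is imported as torch, Torch-Struct as torch_struct), and the PYTHON variable and check_module helper are hypothetical names introduced only for this example. It does not verify the versions listed above.

```shell
#!/bin/sh
# Interpreter to test; override with PYTHON=/path/to/python if needed.
PYTHON="${PYTHON:-python3}"

# Return 0 if the given module can be imported, non-zero otherwise.
check_module() {
    "$PYTHON" -c "import $1" >/dev/null 2>&1
}

# Import names are assumptions: pytorch -> torch, Torch-Struct -> torch_struct.
# benepar is optional, so a "missing" line for it is not necessarily a problem.
for mod in torch easydict terminaltables tqdm numpy torch_struct benepar; do
    if check_module "$mod"; then
        echo "ok       $mod"
    else
        echo "missing  $mod"
    fi
done
```

Any line marked "missing" (other than the optional benepar) points to a package that still needs to be installed in that environment.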

Quick Start

Please download the data from Dropbox or Box and save it to the data folder. Preprocessing details are described in the Preprocessing Details section below.

Training

Run the following commands for training:

sh scripts/didemo/train.sh 
sh scripts/youcook2/train.sh 
sh scripts/msrvtt/train.sh 

For VC-PCFG and MMC-PCFG, each run takes approximately 2 hours on DiDeMo, 1 hour on YouCook2, and 10 hours on MSRVTT on a single GTX 1080 Ti or RTX 2080 Ti.
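
Since the three runs add up to roughly 13 hours, it can be convenient to queue them in one script that stops at the first failure. The following is a minimal sketch, not part of the repository: the run_stage function and the DRY_RUN variable are hypothetical names, and the script assumes it is started from the repository root. By default it only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually launch them.
DRY_RUN="${DRY_RUN:-1}"

# Run scripts/<dataset>/<stage>.sh for each dataset in turn,
# where <stage> is "train" or "test". Stops at the first failing script.
run_stage() {
    stage="$1"
    for dataset in didemo youcook2 msrvtt; do
        script="scripts/$dataset/$stage.sh"
        if [ "$DRY_RUN" = "1" ]; then
            echo "sh $script"
        else
            sh "$script" || return 1
        fi
    done
}

run_stage train
```

Calling run_stage test in the same way would queue the evaluation scripts, once the checkpoints they need are in place.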

Testing

Our trained models are available on Dropbox or Box. Please download them to the final_checkpoints folder.

Then, run the following commands for evaluation:

sh scripts/didemo/test.sh 
sh scripts/youcook2/test.sh 
sh scripts/msrvtt/test.sh 

Preprocessing Details

Download the data from the CVPR 2020 Video Pentathlon challenge and save it to the data folder. Then preprocess the sentences with the following scripts:

python tools/preprocess_captions.py
python tools/compute_gold_trees.py
python tools/generate_vocabularies.py
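
The three scripts above can also be wrapped so that a missing data folder or a failing step stops the sequence early. This is a sketch under assumptions: it presumes the scripts are meant to run in the listed order from the repository root, and the preprocess function and DATA_DIR variable are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Folder the downloaded data is expected in (assumed relative to the repo root).
DATA_DIR="${DATA_DIR:-data}"

# Run the preprocessing tools in the listed order; abort on the first failure.
preprocess() {
    if [ ! -d "$DATA_DIR" ]; then
        echo "data folder '$DATA_DIR' not found; download the data first" >&2
        return 1
    fi
    for tool in preprocess_captions compute_gold_trees generate_vocabularies; do
        python "tools/$tool.py" || return 1
    done
}

preprocess || echo "preprocessing did not complete"
```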

Acknowledgements

This repo is developed based on vpcfg and detr.

Citation

If any part of our paper or code is helpful to your work, please cite:

@InProceedings{zhang2021video,
author = {Zhang, Songyang and Song, Linfeng and Jin, Lifeng and Xu, Kun and Yu, Dong and Luo, Jiebo},
title = {Video-aided Unsupervised Grammar Induction},
booktitle = {NAACL},
year = {2021}
} 
