S2VT: Sequence to Sequence: Video to Text

Note

This repository is not being actively maintained due to lack of time and interest. My sincerest apologies to the open source community for allowing this project to stagnate. I hope it was useful for some of you as a jumping-off point.

Acknowledgement

This code is adapted from jazzsaxmafia's implementation, with several problems in the original code fixed.

Requirements

TensorFlow 0.12
Keras

How to use my code

First, download the MSVD dataset and extract the video features:

$ python extract_RGB_feats.py

After this step, split the extracted features into two parts:

train_features
test_features
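
The split itself is not scripted in this README. A minimal sketch of one way to do it is below. It assumes the extracted features are stored as one .npy file per video in a single flat directory, which is an assumption and not something stated above. The helper names split_names and move_split are hypothetical, and the number of training videos is left for you to choose.

```python
import os
import shutil


def split_names(names, n_train):
    """Return (train, test): the first n_train sorted names, then the rest."""
    ordered = sorted(names)
    return ordered[:n_train], ordered[n_train:]


def move_split(src_dir, n_train, train_dir="train_features", test_dir="test_features"):
    """Move *.npy files from src_dir into the train and test directories."""
    # Assumption: one .npy feature file per video, all stored flat in src_dir.
    names = [f for f in os.listdir(src_dir) if f.endswith(".npy")]
    train, test = split_names(names, n_train)
    for dest, group in ((train_dir, train), (test_dir, test)):
        os.makedirs(dest, exist_ok=True)  # exist_ok requires Python 3
        for name in group:
            shutil.move(os.path.join(src_dir, name), os.path.join(dest, name))
    return len(train), len(test)
```

Sorting before slicing keeps the split reproducible across runs. If you follow a published MSVD split, replace the slicing in split_names with a lookup against that split's video list.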

Second, train the model:

$ CUDA_VISIBLE_DEVICES=0 ipython

Inside the IPython session, run:

>>> import model_RGB
>>> model_RGB.train()

Before training, adjust the training parameters and directory paths in model_RGB.py.

Third, test the model. Choose a trained model, then run:

>>> import model_RGB
>>> model_RGB.test()

After testing, a text file named "S2VT_results.txt" will be generated.

Last, evaluate results with COCO

We evaluate the generated captions with the coco-caption tools.

Run the shell script get_coco_tools.sh to download the COCO tools:

$ ./get_coco_tools.sh

After this, generate the reference JSON file from the ground-truth CSV file:

$ python create_reference.py
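
For orientation, here is a rough sketch of what a reference file in the coco-caption layout can look like when built from a caption CSV. It is not the code in create_reference.py. The column names (VideoID, Start, End, Language, Description) are assumed from the usual MSVD corpus CSV, and the function name build_reference is hypothetical. Check both against your own files.

```python
import csv
import io
import json


def build_reference(csv_file):
    """Build a coco-caption style reference dict from MSVD-style CSV rows."""
    images, annotations, seen = [], [], set()
    for row in csv.DictReader(csv_file):
        # Assumption: only English descriptions are used as references.
        if row.get("Language") != "English":
            continue
        caption = (row.get("Description") or "").strip()
        if not caption:
            continue
        # Assumption: a clip is identified as VideoID_Start_End.
        video_id = "{}_{}_{}".format(row["VideoID"], row["Start"], row["End"])
        if video_id not in seen:
            seen.add(video_id)
            images.append({"id": video_id})
        annotations.append(
            {"image_id": video_id, "id": len(annotations), "caption": caption}
        )
    return {
        "info": {},
        "licenses": [],
        "type": "captions",
        "images": images,
        "annotations": annotations,
    }


# Tiny in-memory demo; a real run would pass an opened CSV file instead.
demo = io.StringIO(
    "VideoID,Start,End,WorkerID,Source,AnnotationTime,Language,Description\n"
    "abc,1,5,w1,clean,10,English,a man is singing\n"
)
print(json.dumps(build_reference(demo), indent=2))
```

Each clip appears once under "images", while every caption for that clip gets its own entry under "annotations", so a clip with many human descriptions contributes many references.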

Then, generate the results JSON file from the S2VT_results.txt file:

$ python create_result_json.py
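
The coco-caption tools read results as a JSON list of objects with "image_id" and "caption" keys. The sketch below shows that conversion under one loud assumption: that each line of S2VT_results.txt holds a video ID and a caption separated by a tab. The actual line format is not documented here, so adjust the separator to match your file. The function name parse_results is hypothetical, and this is not the code in create_result_json.py.

```python
import json


def parse_results(lines, sep="\t"):
    """Turn 'video_id<sep>caption' lines into coco-caption result entries."""
    results = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and lines that lack the assumed separator.
        if not line or sep not in line:
            continue
        video_id, caption = line.split(sep, 1)
        results.append({"image_id": video_id.strip(), "caption": caption.strip()})
    return results


# In-memory demo; a real run would iterate over open("S2VT_results.txt").
demo_lines = ["vid1\ta man is singing\n", "vid2\ta dog runs\n"]
print(json.dumps(parse_results(demo_lines), indent=2))
```

The "image_id" values must match the "id" values used in the reference file, otherwise the evaluation has nothing to compare.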

Finally, you can evaluate the generation results:

$ python eval.py

Results

Model	METEOR
S2VT(ICCV 2015)
-RGB(VGG)	29.2
-Optical Flow(AlexNet)	24.3
Our model
-RGB(VGG)	28.1
-Optical Flow(AlexNet)	23.3

Attention

Please feel free to ask me if you have questions.
I have committed only the RGB part of my code; you can modify it to use optical flow features.
