SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama: https://arxiv.org/abs/2408.09333v2


vaew/SkyScript-100M


SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama

Jing Tang¹, Quanlu Jia², Yuqiang Xie²†, Zeyu Gong¹†, Xiang Wen², Jiayi Zhang², Yalong Guo², Guibin Chen², Jiangping Yang²

† Corresponding author

[Figure: showcase]

Introduction 📖

Generating high-quality shooting scripts that contain information such as scene and shot language is essential for short drama script generation. We collect 6,660 popular short drama episodes from the Internet, each with an average of 100 short episodes; in total there are about 80,000 short episodes, with a combined duration of about 2,000 hours and a size of about 10 terabytes (TB). We perform keyframe extraction and annotation on each episode to obtain about 10,000,000 shooting scripts. We then perform 100 script restorations on each extracted shooting script using our self-developed large short drama generation model, SkyReels. This yields a dataset of 1,000,000,000 pairs of scripts and shooting scripts for short dramas, called SkyScript-100M. We compare SkyScript-100M with existing datasets in detail and present some of the deeper insights it enables. With SkyScript-100M, researchers can pursue deeper and more far-reaching script optimization goals, which may drive a paradigm shift in text-to-video and significantly advance short drama video generation.
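As a quick sanity check on the scale described above, the following sketch recomputes the pair count and a few per-episode averages from the quoted figures. It is only a back-of-the-envelope illustration: the constant and function names are hypothetical, the inputs are the approximate numbers from the introduction, and the per-episode values are derived averages (assuming 1 TB = 1,000,000 MB), not statistics reported by the authors.

```python
# Approximate figures quoted in the introduction.
NUM_SHORT_EPISODES = 80_000          # total short episodes collected
TOTAL_HOURS = 2_000                  # total video duration in hours
TOTAL_TERABYTES = 10                 # total raw size (assumption: 1 TB = 1,000,000 MB)
NUM_SHOOTING_SCRIPTS = 10_000_000    # shooting scripts after keyframe annotation
RESTORATIONS_PER_SCRIPT = 100        # script restorations per shooting script


def dataset_summary() -> dict:
    """Derive rough dataset-scale numbers from the quoted figures."""
    return {
        # Each shooting script is paired with each of its restored scripts.
        "pairs": NUM_SHOOTING_SCRIPTS * RESTORATIONS_PER_SCRIPT,
        # Derived averages; illustrative only.
        "minutes_per_episode": TOTAL_HOURS * 60 / NUM_SHORT_EPISODES,
        "megabytes_per_episode": TOTAL_TERABYTES * 1_000_000 / NUM_SHORT_EPISODES,
        "shooting_scripts_per_episode": NUM_SHOOTING_SCRIPTS / NUM_SHORT_EPISODES,
    }


if __name__ == "__main__":
    for name, value in dataset_summary().items():
        print(f"{name}: {value:,}")
```

Under these assumptions, the 1,000,000,000 pairs follow directly from 10,000,000 shooting scripts with 100 restorations each.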

🔥 We have uploaded some sample data for early research!

🔥 SkyReels built on our dataset is available!

📖 Technical Report is available!

Acknowledgements 💐

We would like to thank the contributors to the GroundingDINO, DeepFace, and AlphaPose repositories for their open research and contributions.

Citation 💖

If you find SkyScript-100M useful for your research, please consider giving this repo a 🌟 and citing our work using the following BibTeX:

@misc{tang2024skyscript100m,
      title={SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama},
      author={Jing Tang and Quanlu Jia and Yuqiang Xie and Zeyu Gong and Xiang Wen and Jiayi Zhang and Yalong Guo and Guibin Chen and Jiangping Yang},
      year={2024},
      eprint={2408.09333v2},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contact 📧

Jing Tang (唐晶): [email protected]
