Anime Word Clouds: Visualization of how frequently each word is spoken in your favorite show

TheRaphael0000/anime_wordclouds

Information

Feel free to create pull requests, but do not commit subtitles!

To create a visualization:

  1. Extract the subtitles to the VTT format using FFmpeg. For obvious copyright reasons, they can't be stored in the repository.
  2. Preprocess the image using a graphical tool to create a mask.
    • Black: Word cloud space
    • White: Kept as is from the image
    • Grey value: Discarded from the visualization
  3. From this mask and the words obtained from the subtitles, the script uses nltk to remove stop words, wordcloud to create the visualization, and a bit of numpy image math.
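
The steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the repository's actual script: the function names, the tiny `STOP_WORDS` set (a stand-in for the nltk stop-word list), and the exact FFmpeg arguments are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stand-in for the nltk stop-word list used by the real script.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in", "i", "you", "it"}


def ffmpeg_extract_command(video_path, vtt_path, stream_index=0):
    """Build an FFmpeg command that writes one subtitle stream to a .vtt file.

    The returned list could be passed to subprocess.run(); it is not
    executed here. Which subtitle stream to pick is an assumption.
    """
    return ["ffmpeg", "-i", video_path, "-map", f"0:s:{stream_index}", vtt_path]


def vtt_to_words(vtt_text):
    """Return the lowercase words spoken in a WebVTT document."""
    words = []
    for line in vtt_text.splitlines():
        line = line.strip()
        # Skip the header, blank lines, cue timings and numeric cue identifiers.
        if not line or line.startswith("WEBVTT") or "-->" in line or line.isdigit():
            continue
        line = re.sub(r"<[^>]+>", "", line)  # drop inline tags such as <i>
        words.extend(re.findall(r"[a-z]+(?:'[a-z]+)?", line.lower()))
    return words


def word_frequencies(words, stop_words=STOP_WORDS):
    """Count the words that are not stop words."""
    return Counter(w for w in words if w not in stop_words)


def classify_mask_value(value):
    """Map an 8-bit greyscale mask value to its role in the visualization."""
    if value == 0:
        return "cloud"    # black: word cloud space
    if value == 255:
        return "keep"     # white: kept as is from the image
    return "discard"      # any grey value: discarded from the visualization
```

With the wordcloud package installed, a frequency table like the one returned by `word_frequencies` can be handed to `WordCloud(mask=...).generate_from_frequencies(...)`. How the mask array is derived from the three pixel classes is left to the numpy step and is not shown here.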

List

  1. Neon Genesis Evangelion
  2. Cowboy Bebop
  3. Darling in the Franxx
  4. Mirai Nikki
  5. Death Note
  6. Steins;Gate
  7. One-Punch Man
  8. Bocchi the Rock!
  9. Chainsaw Man

Neon Genesis Evangelion

Data used:

Reddit posts: r/dataisbeautiful / r/evangelion

Creation date: 20210122

Cowboy Bebop

Data used:

Reddit posts: r/dataisbeautiful / r/cowboybebop

Creation date: 20210509

Darling in the Franxx

Data used:

Creation date: 20211115

Mirai Nikki

Data used:

Creation date: 20220304

Death Note

Data used:

Creation date: 20220822

Steins;Gate

Data used:

Creation date: 20230720

One-Punch Man

Data used:

Creation date: 20230801

Bocchi the Rock!

Data used:

Creation date: 20241103

Chainsaw Man

Data used:

Creation date: 20241103