Foreign-Whispers

Sections

About

Foreign Whispers is a tool that translates your video content into Spanish, adding both dubbed speech and written subtitles while replicating the original speakers' voices. It combines several AI models to produce a translated video that remains natural to watch for Spanish-speaking audiences.

YouTube Video Download: Automatically downloads the YouTube video used as input for subtitling and voice replication.
Speaker Diarization: Leverages speaker diarization technology to accurately identify and separate different speakers in your video, ensuring precise voice replication and subtitle alignment.
Voice Cloning and Translation: Clones voices and translates spoken content into Spanish, making your content accessible to Spanish-speaking audiences.
Video Compilation: Integrates the translated audio and subtitles into the original video. The tool manages transitions and synchronization so that the final output keeps the video’s original flow and visual coherence while incorporating the new language features.

Approaches

Audio stretching/shrinking

To match the video length, the script speeds up the translated audio in specific sections. This ensures that the translated speech fits within the timestamps of each speaker's segment.
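As an illustration of the idea, the speed-up needed for one segment can be expressed as the ratio between the length of the translated speech and the length of the time slot it must fit. The sketch below is not taken from the project's code: the function name `speed_factor` and the `max_factor` cap are assumptions made for this example.

```python
def speed_factor(speech_duration: float, slot_duration: float,
                 max_factor: float = 2.0) -> float:
    """Return a playback-rate multiplier that makes the speech fit its slot.

    Hypothetical helper, not part of the repository. A result of 1.0 means
    the speech already fits; values above 1.0 mean the audio must be sped up.
    The cap (max_factor) is an assumed safeguard against unintelligible audio.
    """
    if speech_duration <= 0 or slot_duration <= 0:
        raise ValueError("durations must be positive")
    factor = speech_duration / slot_duration
    # Never slow the audio down, and never exceed the assumed cap.
    return min(max(factor, 1.0), max_factor)


# A 6-second translation that must fit a 4-second segment.
print(speed_factor(6.0, 4.0))
```

When the cap is reached, the speech still does not fit its slot, which is one reason the frame-based approach described next can be useful.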

Frame adding/deleting

To let the translated speech play at its natural speed, the script computes the difference between the duration of each translated audio segment and its original time interval, then sets markers where frames need to be inserted or removed.
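The arithmetic behind this can be sketched as follows. This is a minimal example under stated assumptions, not the project's implementation: the function name `frame_adjustment` and the default frame rate of 30 fps are hypothetical.

```python
def frame_adjustment(speech_duration: float, segment_duration: float,
                     fps: float = 30.0) -> int:
    """Return how many video frames to add (positive) or drop (negative).

    Hypothetical helper, not part of the repository. The time difference
    between the translated speech and the original segment is converted
    into a whole number of frames at the assumed frame rate.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return round((speech_duration - segment_duration) * fps)


# Translated speech runs 1 second longer than the original 4-second segment,
# so about one second's worth of frames has to be inserted.
print(frame_adjustment(5.0, 4.0))
```

A positive result means the video segment has to be lengthened to cover the speech; a negative result means frames can be removed.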

Issues

Audio Speed

Adjusting the audio speed can make the speech sound unnatural and hard to understand.

Audio Artifacts

Background noise and model variability can introduce random audio artifacts, affecting the clarity and quality of the output.

Example Outputs

Audio Speed Manipulation

Demo.Vid.mp4

Audio Speed Frame Manipulation

final_video.mp4

How to use:

Before running the script, ensure you have Python installed on your machine. You can download Python from the official website.

Clone the repository to your local machine:

git clone https://github.com/RobCaamano/foreign-whispers.git

Navigate to the project directory and install the dependencies listed in 'requirements.txt'. It is recommended to do this in a virtual environment. For information about creating and using conda environments, see the conda documentation.

cd '[path to dir]'
pip install -r requirements.txt

This script runs best with GPU acceleration. To use it, you need a GPU with CUDA support and the appropriate CUDA toolkit installed. Follow the CUDA installation guide for details.

Note: The time required for the script to complete depends on the computational power of your GPU.

Run the script in your virtual environment, passing the video URL as a command-line argument:

python ./main.py "[url]"

Hugging Face Space

Our Hugging Face Space provides an accessible interface for experimenting with our models. Note that the Space is configured to run on CPU rather than GPU. For faster processing, we recommend cloning the repository and running the models locally on your own hardware.

Coqui Terms of Service

By using this project, you agree to the Coqui Terms of Service.
