Video Transcription for the SubtitleSwitcher Plugin
SubtitleSwitcher is a plugin for the AVideo platform that lets users upload or manually create subtitles in .srt and .vtt format. It also integrates with Vosk, an offline speech recognition toolkit, to transcribe video and audio files automatically. This tutorial walks through installing and configuring the SubtitleSwitcher plugin and the Vosk transcriber.
Upon installation of the SubtitleSwitcher plugin, several options are available for customization:
defaultSubtitleLanguageUseUserLocation: If enabled, the default subtitle language is based on the user's location.
Once the plugin is enabled, two new options appear in the video manager: 'Create Subtitle' and 'Upload Subtitle'. 'Upload Subtitle' lets you upload a .srt or .vtt file. 'Create Subtitle' opens an editor where you set the start and end times for each subtitle segment and edit its text.
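For reference, both subtitle formats are plain text. The sketch below writes the same two made-up cues as a .srt and a .vtt file, to show how a start time, end time, and text segment are laid out in each format. The file names and cue text are illustrative only and are not output from SubtitleSwitcher.

```shell
# Illustrative only: write the same two cues as .srt and .vtt into a temp directory.
demo_dir="$(mktemp -d)"

# SubRip (.srt): numbered cues, comma before the milliseconds.
cat > "$demo_dir/example.srt" <<'EOF'
1
00:00:01,000 --> 00:00:04,000
Welcome to the video.

2
00:00:04,500 --> 00:00:08,000
This is the second subtitle.
EOF

# WebVTT (.vtt): starts with a WEBVTT header, period before the milliseconds.
cat > "$demo_dir/example.vtt" <<'EOF'
WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to the video.

00:00:04.500 --> 00:00:08.000
This is the second subtitle.
EOF

echo "Sample files written to $demo_dir"
```

Either file could then be used with the 'Upload Subtitle' option.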
Vosk is an offline speech recognition toolkit that supports multiple languages. Vosk-transcriber is a Python package that uses the Vosk library to transcribe audio files. Installing it on Ubuntu lets the SubtitleSwitcher plugin transcribe video and audio automatically, so you can generate subtitles in various languages without an internet connection. The steps below install Vosk-transcriber on Ubuntu.
Before installing new packages, it is good practice to update your system. Open a terminal and run:
sudo apt update && sudo apt upgrade
The easiest way to install the Vosk API is with pip; you do not have to compile anything. Make sure your Python and pip versions meet these requirements:
- Python version: 3.5-3.9
- pip version: 20.3 and newer.
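One way to check these requirements before installing is sketched below. It assumes GNU `sort -V` (present on Ubuntu) and that `pip3 --version` prints the version as its second word. The helper name `version_ge` is made up for this example.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on GNU sort's -V (version sort), which is available on Ubuntu.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Check Python against the 3.5-3.9 range listed above.
if command -v python3 >/dev/null 2>&1; then
    py_version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    if version_ge "$py_version" 3.5 && version_ge 3.9 "$py_version"; then
        echo "Python $py_version is within the listed range"
    else
        echo "Python $py_version is outside the listed 3.5-3.9 range"
    fi
else
    echo "python3 not found"
fi

# Check pip against the 20.3 minimum listed above.
if command -v pip3 >/dev/null 2>&1; then
    pip_version="$(pip3 --version | awk '{print $2}')"
    if version_ge "$pip_version" 20.3; then
        echo "pip $pip_version is new enough"
    else
        echo "pip $pip_version is older than 20.3; try: pip3 install --upgrade pip"
    fi
else
    echo "pip3 not found"
fi
```

The script only reports what it finds and changes nothing on the system.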
pip3 install vosk
For more information, see https://alphacephei.com/vosk/install
Vosk requires a language model to perform speech recognition. Download a pre-built language model from the Vosk website (https://alphacephei.com/vosk/models) or the GitHub repository (https://github.com/alphacep/vosk-api/blob/master/doc/models.md). For example, to download the English language model, run:
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
Unzip the downloaded language model:
unzip vosk-model-small-en-us-0.15.zip
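With the model unpacked, a transcription run could look like the sketch below. It assumes the `vosk-transcriber` command installed with the `vosk` pip package, with `-m` for the model directory, `-i` for the input file, `-t` for the output type, and `-o` for the output path. Confirm the flags with `vosk-transcriber --help` on your system. It also assumes the archive unpacks into a directory named `vosk-model-small-en-us-0.15`. The helper names `srt_path_for` and `transcribe_to_srt` and the input file `video.mp4` are hypothetical.

```shell
# srt_path_for FILE: print FILE with its extension replaced by .srt
# (assumes the file name itself has an extension).
srt_path_for() {
    printf '%s.srt\n' "${1%.*}"
}

# transcribe_to_srt INPUT MODEL_DIR: write subtitles next to the input file.
# Assumes vosk-transcriber is on PATH (installed by "pip3 install vosk").
transcribe_to_srt() {
    input="$1"
    model_dir="$2"
    output="$(srt_path_for "$input")"
    vosk-transcriber -m "$model_dir" -i "$input" -t srt -o "$output"
}

# Example call (hypothetical input file; uncomment to run):
# transcribe_to_srt video.mp4 vosk-model-small-en-us-0.15
```

Because the output is a .srt file, it can be reviewed or corrected by hand before being attached to the video.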