How to use

Text cleaner from https://github.com/CjangCjengh/vits

Original repo: https://github.com/jaywalnut310/vits

Online training and inference

colab

See vits-finetuning

How to use

(Recommended) Python 3.7

Only Japanese datasets can be used for fine-tuning in this repo.

Clone this repository

git clone https://github.com/SayaSS/vits-finetuning.git

Install requirements

pip install -r requirements.txt

Download pre-trained model

G_0.pth
D_0.pth
Edit "model_dir" (line 152) in utils.py.
Put the pre-trained models in "model_dir"/checkpoints.
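
Before training, it can help to confirm that both files ended up where expected. The following is a minimal sketch, not part of this repo: the helper name `missing_checkpoints` is hypothetical, and it assumes only the layout described above ("model_dir"/checkpoints containing G_0.pth and D_0.pth).

```python
import os

# File names of the pre-trained generator and discriminator listed above.
EXPECTED = ("G_0.pth", "D_0.pth")


def missing_checkpoints(model_dir):
    """Return the pre-trained files not yet present in <model_dir>/checkpoints."""
    ckpt_dir = os.path.join(model_dir, "checkpoints")
    return [
        name
        for name in EXPECTED
        if not os.path.isfile(os.path.join(ckpt_dir, name))
    ]
```

Calling `missing_checkpoints("path/to/model_dir")` returns an empty list when both files are in place.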

If you need to customize "n_speakers", please replace the pre-trained model with these two.

Create datasets

Speaker ID should be between 0 and 803.
About 50 audio-text pairs are enough, and 100-600 epochs can give quite good results, though more data may help.
Resample all audio to 22050 Hz, 16-bit, mono WAV files.
Each audio file should be between 1 and 10 seconds long.
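
The audio requirements above can be checked automatically before preprocessing. This is a minimal sketch using only Python's standard `wave` module; the function name `check_wav` is hypothetical and not part of this repo. It only reports problems and does not resample anything.

```python
import wave

# Requirements taken from the list above.
TARGET_RATE = 22050        # Hz
TARGET_WIDTH = 2           # bytes per sample, i.e. 16-bit
TARGET_CHANNELS = 1        # mono
MIN_SECONDS, MAX_SECONDS = 1.0, 10.0


def check_wav(path):
    """Return a list of problems found in one WAV file (empty if it looks fine)."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    if rate != TARGET_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {TARGET_RATE} Hz")
    if width != TARGET_WIDTH:
        problems.append(f"sample width is {8 * width}-bit, expected 16-bit")
    if channels != TARGET_CHANNELS:
        problems.append(f"{channels} channels, expected mono")
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration is {duration:.2f}s, expected 1-10s")
    return problems
```

Running `check_wav` over every file in the dataset folder and printing any non-empty result is usually enough to catch files that still need resampling or trimming.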

path/to/XXX.wav|speaker id|transcript

Example

dataset/001.wav|10|こんにちは。

For complete examples, please see filelists/miyu_train.txt and filelists/miyu_val.txt.
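
A filelist in this format is easy to validate line by line. The sketch below is an illustration only: `parse_filelist_line` is a hypothetical helper, not a function from this repo, and it assumes the transcript itself contains no "|" character.

```python
MAX_SPEAKER_ID = 803  # upper bound stated in the dataset requirements


def parse_filelist_line(line):
    """Split one 'path|speaker id|transcript' line and validate its fields."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {len(parts)}: {line!r}")
    path, speaker, transcript = parts
    speaker_id = int(speaker)  # raises ValueError if the ID is not an integer
    if not 0 <= speaker_id <= MAX_SPEAKER_ID:
        raise ValueError(f"speaker id {speaker_id} is outside 0-{MAX_SPEAKER_ID}")
    if not transcript.strip():
        raise ValueError(f"empty transcript for {path}")
    return path, speaker_id, transcript
```

Applied to the example line above, the function returns the path, the integer speaker ID 10, and the transcript as a tuple; malformed lines raise `ValueError`.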

Preprocess

python preprocess.py --filelists path/to/filelist_train.txt path/to/filelist_val.txt

Edit "training_files" and "validation_files" in configs/config.json
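
The same edit can be scripted. This is a sketch under stated assumptions: the helper names are hypothetical, and because this README does not say where the two keys sit inside config.json, the code searches the whole file for them instead of assuming a fixed location. Which filelist paths to pass depends on what preprocess.py wrote, so check its output first.

```python
import json


def set_keys(node, updates):
    """Overwrite every key named in `updates` inside nested dicts/lists; return the count."""
    count = 0
    if isinstance(node, dict):
        for key, value in list(node.items()):
            if key in updates:
                node[key] = updates[key]
                count += 1
            else:
                count += set_keys(value, updates)
    elif isinstance(node, list):
        for item in node:
            count += set_keys(item, updates)
    return count


def update_filelists(config_path, train_list, val_list):
    """Point "training_files" and "validation_files" at the given filelists."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    updates = {"training_files": train_list, "validation_files": val_list}
    if set_keys(config, updates) != 2:
        raise KeyError("expected to find each of the two keys exactly once")
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, ensure_ascii=False)
```

For example, `update_filelists("configs/config.json", "path/to/train_list", "path/to/val_list")`, where both list paths are placeholders.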

Build monotonic alignment search

cd monotonic_align
python setup.py build_ext --inplace
cd ..

Train

# Multiple speakers
python train_ms.py -c configs/config.json -m checkpoints

