Cantonese/Chinese text-to-speech based on statistical parametric speech synthesis, using the Merlin toolkit
This project is influenced by MTTS
- First, prepare data containing wav files and txt transcripts (prosody marks are optional)
- Second, generate HTS labels using this project
- Third, use merlin/egs/cantonese_voice to train and generate the Cantonese voice
Python: 3.6
System: Linux (tested on Ubuntu 16.04)
sudo apt-get install libatlas3-base
Run bash tools/install_mtts.sh
Or download the files yourself:
- Download Montreal-Forced-Aligner and unzip it into the tools/ directory
Run Demo
bash run_demo.sh
- Usage: Run
python src/mtts.py txtfile wav_directory_path output_directory_path
(absolute and relative paths are both accepted). This generates the HTS labels. If you have your own acoustic model trained with Montreal-Forced-Aligner, add -a your_acoustic_model.zip; otherwise, this project uses the thchs30.zip acoustic model by default.
- Attention: Currently only Chinese characters are supported; the txt file should not contain any Arabic numerals or English letters.
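When label generation is driven from another Python script, the command line described above can be assembled programmatically. The following is only a minimal sketch: the helper name build_mtts_command is hypothetical and not part of this project, and placing the -a option after the positional arguments is an assumption the source does not confirm.

```python
import sys


def build_mtts_command(txtfile, wav_dir, output_dir, acoustic_model=None):
    """Build the argument list for src/mtts.py.

    Hypothetical helper for illustration; not part of the project.
    """
    # Use the current interpreter so the same Python environment is reused.
    cmd = [sys.executable, "src/mtts.py", txtfile, wav_dir, output_dir]
    if acoustic_model is not None:
        # Assumption: -a may follow the positional arguments.
        cmd += ["-a", acoustic_model]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the project root, so that the relative path src/mtts.py resolves.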
txtfile example
A_01 这是一段文本
A_02 这是第二段文本
wav_directory example (sampling rate should be higher than 16 kHz)
A_01.wav
A_02.wav
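Before running the tool, it can help to check that the transcript file and the wav directory agree with the constraints described here. The sketch below is an illustration under stated assumptions, not part of this project: the function names are hypothetical, each transcript line is assumed to be an utterance ID followed by whitespace and the text, the wav files are assumed to be plain PCM readable by Python's wave module, and 16000 Hz itself is treated as acceptable.

```python
import re
import wave
from pathlib import Path

# ASCII digits and Latin letters are not allowed in the transcript text.
NON_CHINESE = re.compile(r"[0-9A-Za-z]")


def parse_txtfile(path):
    """Return {utterance_id: text}, assuming '<id> <text>' on each line."""
    entries = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split(None, 1)
            if not parts:
                continue  # skip blank lines
            entries[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return entries


def check_text(text):
    """True if the text contains no ASCII digits or Latin letters."""
    return NON_CHINESE.search(text) is None


def check_dataset(txtfile, wav_dir, min_rate=16000):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    for utt_id, text in parse_txtfile(txtfile).items():
        if not check_text(text):
            problems.append(f"{utt_id}: text contains digits or Latin letters")
        wav_path = Path(wav_dir) / f"{utt_id}.wav"
        if not wav_path.is_file():
            problems.append(f"{utt_id}: missing {wav_path.name}")
            continue
        with wave.open(str(wav_path), "rb") as w:
            rate = w.getframerate()
        if rate < min_rate:  # assumption: min_rate itself is acceptable
            problems.append(f"{utt_id}: sampling rate {rate} Hz is too low")
    return problems
```

Calling check_dataset('txtfile', 'wav_directory_path') and printing the returned list gives a quick report before the slower alignment step starts.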
- Usage: Run
python src/mandarin_frontend.py txtfile output_directory_path
- or import mandarin_frontend
from mandarin_frontend import txt2label
result = txt2label('向香港特别行政区同胞澳门和台湾同胞海外侨胞')
for line in result:
    print(line)
See the source code for more information, but pay attention to the alignment file (sfs file): its format is end_time phone_type, not start_time phone_type (which differs from Speech Ocean's data).
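Because each alignment entry stores only the end time, the start time of a phone has to be taken from the end time of the previous entry. The following is a minimal sketch of that conversion, not the project's own code: the function name is hypothetical, each line is assumed to be a whitespace-separated end time followed by a phone type, the first phone is assumed to start at 0, and the time unit is left as it appears in the file.

```python
def end_times_to_intervals(lines, start=0.0):
    """Convert 'end_time phone_type' lines into (start, end, phone) tuples."""
    intervals = []
    prev_end = start  # assumption: the first phone starts at `start`
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        end, phone = float(parts[0]), parts[1]
        intervals.append((prev_end, end, phone))
        prev_end = end  # the next phone begins where this one ends
    return intervals
```

For example, passing the lines of an alignment file opened with open(path, encoding='utf-8') yields one tuple per phone, which is easier to compare against data that stores start times.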
This project uses Montreal-Forced-Aligner to do forced alignment. If you want a better alignment, use your own data to train an alignment model; see mfa: align-using-only-the-dataset
- We trained the acoustic model on our dataset.
You can generate HTS labels without prosody marks. We assume that a word segment is smaller than a prosodic word (this is adjusted in the code).
- Text Normalization
- Better Chinese word segmentation
- G2P: Polyphone Problem
- Better Label format and Question Set
- Improved prosody analysis
- Better alignment
- miran899