
kNN-VC

ColabBadge PaperBadge

Clone of the official kNN-VC, a simple kNN-based voice conversion method.

[Figure: the kNN-VC method]

Demo

See the official demo.

Usage

Install

# Python >=3.10
pip install "torch>=2" "torchaudio>=2" numpy

No kNN-VC install is needed; torch.hub handles everything 😉
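To confirm the environment is ready, a quick sanity check (it only verifies versions and GPU visibility):

import torch, torchaudio

print(torch.__version__)          # expect >= 2.0
print(torchaudio.__version__)     # expect >= 2.0
print(torch.cuda.is_available())  # whether a CUDA GPU is visible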

Inference

import torch, torchaudio

src_wav_path = '<path to arbitrary 16kHz waveform>.wav'
ref_wav_paths = ['<path to arbitrary 16kHz waveform from target speaker>.wav', '<path to 2nd utterance from target speaker>.wav', ...]

knn_vc = torch.hub.load('tarepan/knn-vc-official', 'knn_vc', prematched=True, trust_repo=True, pretrained=True)

query_seq = knn_vc.get_features(src_wav_path)
matching_set = knn_vc.get_matching_set(ref_wav_paths)

out_wav = knn_vc.match(query_seq, matching_set, topk=4)
# out_wav is a (T,) tensor: the converted 16kHz output waveform, using k=4 for kNN.
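Note that the pipeline expects 16kHz waveforms, as the paths above indicate. If your audio uses another sample rate, resample it first; a minimal sketch with torchaudio (file names are placeholders):

import torchaudio

wav, sr = torchaudio.load('input.wav')  # placeholder path; any sample rate
if sr != 16000:
    wav = torchaudio.functional.resample(wav, orig_freq=sr, new_freq=16000)
torchaudio.save('input_16k.wav', wav, 16000)  # now usable as a src/ref input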

Options:

  • knn_vc.match
    • topk: int - number of nearest neighbors averaged per frame (see the sketch below)
  • torch.hub.load
    • prematched: bool - whether to load the vocoder trained on prematched features (True) or on raw WavLM features (False)
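For intuition, the core of knn_vc.match is plain kNN regression over WavLM features: every source frame is replaced by the mean of its topk nearest target-speaker frames under cosine similarity, and the vocoder then turns the matched features into audio. A minimal sketch of the matching step (not the repository's exact implementation):

import torch
import torch.nn.functional as F

def knn_match_sketch(query_seq, matching_set, topk=4):
    """Replace each query frame with the mean of its topk nearest matching-set frames."""
    q = F.normalize(query_seq, dim=-1)     # (T, D) unit-norm source features
    m = F.normalize(matching_set, dim=-1)  # (N, D) unit-norm target-speaker features
    sims = q @ m.T                         # (T, N) cosine similarities
    idx = sims.topk(topk, dim=-1).indices  # (T, topk) nearest-neighbor indices
    return matching_set[idx].mean(dim=1)   # (T, D) converted feature sequence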

Checkpoints

Under the releases tab of this repo we provide three checkpoints:

  • Encoder: WavLM (taken from official WavLM)
  • Vocoder 1: HiFiGAN w/ raw-WavLM-L6
  • Vocoder 2: HiFiGAN w/ prematched-WavLM-L6

For the HiFiGAN models we provide both the generator-only inference checkpoint and the full training checkpoint with optimizer states.
For performance, see the paper.
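To peek inside a downloaded checkpoint, standard torch loading works (the file name below is hypothetical; use the actual asset name from the releases tab):

import torch

ckpt = torch.load('hifigan_generator.pt', map_location='cpu')  # hypothetical file name
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))  # full training checkpoints also carry optimizer states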

Train

Install librosa, tensorboard, matplotlib, fastprogress and scipy.

  1. Precompute WavLM features for the vocoder dataset: we provide a utility for LibriSpeech in prematch_dataset.py (a conceptual sketch of prematching follows this list):

    usage: prematch_dataset.py [-h] --librispeech_path LIBRISPEECH_PATH
                               [--seed SEED] --out_path OUT_PATH [--device DEVICE]
                               [--topk TOPK] [--matching_layer MATCHING_LAYER]
                               [--synthesis_layer SYNTHESIS_LAYER] [--prematch]
                               [--resume]

    e.g. (prematch): python prematch_dataset.py --librispeech_path /path/to/librispeech/root --out_path /path/where/you/want/outputs/to/go --topk 4 --matching_layer 6 --synthesis_layer 6 --prematch

  2. Train HiFiGAN for 2.5M steps:

    python -m hifigan.train --audio_root_path /path/to/librispeech/root/ --feature_root_path /path/to/the/output/of/previous/step/ --input_training_file data_splits/wavlm-hifigan-train.csv --input_validation_file data_splits/wavlm-hifigan-valid.csv --checkpoint_path /path/where/you/want/to/save/checkpoint --fp16 False --config hifigan/config_v1_wavlm.json --stdout_interval 25 --training_epochs 1800 --fine_tuning
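Conceptually, the --prematch flag in step 1 replaces each training utterance's WavLM features with their kNN means drawn from a pool of the same speaker's other utterances, so the vocoder is trained on the same kind of averaged features it will see at conversion time. A rough illustration using the hub API and the knn_match_sketch helper above (paths are placeholders; this is not the script's exact logic):

# Prematch one utterance against a pool from the same speaker (illustrative only).
utt_feats = knn_vc.get_features('speaker_a/utt_001.flac')   # (T, D) query features
pool = knn_vc.get_matching_set(['speaker_a/utt_002.flac',   # same speaker,
                                'speaker_a/utt_003.flac'])  # other utterances
prematched = knn_match_sketch(utt_feats, pool, topk=4)      # vocoder training input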

Results

Sample

Demo

Performance

  • training
    • xx [iter/sec] @ NVIDIA A100 on Paperspace Gradient Notebook (ConvTF32/AMP)
    • takes about xx days for the whole training run
  • inference
    • z.z [sec/sample] @ xx

Official info

References

Original paper

PaperBadge

@misc{2305.18975,
  author = {Matthew Baas and Benjamin van Niekerk and Herman Kamper},
  title  = {Voice Conversion With Just Nearest Neighbors},
  year   = {2023},
  eprint = {arXiv:2305.18975},
}

Acknowledgements
