NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
Junhyeok Lee, Seungu Han @ MINDsLab Inc., SNU
Paper(arXiv): https://arxiv.org/abs/2104.02321 (Accepted to INTERSPEECH 2021)
Audio Samples: https://mindslab-ai.github.io/nuwave
Official PyTorch Lightning implementation of NU-Wave.
Update: CODE RELEASED! The README is still being updated.
TODO: How to run preprocessing / training / evaluation
TODO
TODO
run trainer.py
TODO
run for_test.py or test.py
.
├── Dockerfile
├── dataloader.py # Dataloader for train/val(=test)
├── filters.py # Filter implementation
├── test.py # Test with lightning_loop.
├── for_test.py # Test with for_loop. Recommended due to device dependency of lightning
├── hparameter.yaml # Config
├── lightning_model.py # NU-Wave implementation. DDPM is based on ivanvovk's WaveGrad implementation
├── model.py # NU-Wave model based on lmnt-com's DiffWave implementation
├── requirement.txt # Required libraries
├── sampling.py # Sampling a file
├── trainer.py # Lightning trainer
├── README.md
├── utils
│ ├── stft.py # STFT layer
│ ├── tblogger.py # Tensorboard Logger for lightning
│ └── wav2pt.py # Preprocessing
└── docs # For github.io
└─ ...
- PyTorch >=1.7.0 for nn.SiLU (swish)
- PyTorch-Lightning==1.1.6
- The requirements are listed in requirements.txt.
- We also provide a Docker setup in Dockerfile.
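To confirm that an environment matches the version pins above before training, a small check like the following may help. This is only a sketch and not part of the repository: the helper names `parse_version`, `satisfies`, and `check_environment` are hypothetical, and the PyPI distribution names `torch` and `pytorch-lightning` are assumed. It uses only the standard library, so it runs even when neither package is installed.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.7.0+cu110' into (1, 7, 0); local and pre-release suffixes are ignored."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two version strings with '>=' or '=='."""
    a, b = parse_version(installed), parse_version(required)
    # Pad to equal length so that '1.7' compares equal to '1.7.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    if op == ">=":
        return a >= b
    if op == "==":
        return a == b
    raise ValueError(f"unsupported operator: {op}")


def check_environment():
    """Return {package: status} for the two pins listed in this README."""
    # Assumed distribution names; adjust if your environment differs.
    pins = {"torch": (">=", "1.7.0"), "pytorch-lightning": ("==", "1.1.6")}
    report = {}
    for name, (op, required) in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = satisfies(installed, op, required)
        report[name] = f"{installed} ({'ok' if ok else 'expected ' + op + required})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

The comparison is deliberately simplistic (numeric release segments only); for anything beyond a quick sanity check, the pinned requirements file or the Dockerfile remains the more reliable route.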
This implementation uses code from the following repositories:
- J.Ho's official DDPM implementation
- lucidrains' DDPM PyTorch implementation
- ivanvovk's WaveGrad PyTorch implementation
- lmnt-com's DiffWave PyTorch implementation
This README and the webpage for the audio samples are inspired by:
- Tips for Publishing Research Code
- Audio samples webpage of DCA
- Cotatron
- Audio samples webpage of WaveGrad
The audio samples on our webpage are partially derived from:
- VCTK: 46 hours of English speech from 108 speakers.
If this repository is useful for your research, please consider citing it! The BibTeX entry will be updated after the INTERSPEECH 2021 conference.
@article{lee2021nuwave,
title={NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling},
author={Lee, Junhyeok and Han, Seungu},
journal={arXiv preprint arXiv:2104.02321},
year={2021}
}
If you have any questions or inquiries, please contact Junhyeok Lee at [email protected]