Noise-Aware Speech Separation (NASS)

NOTE: This paper has been accepted by ICASSP 2024!

This repository provides examples of Sepformer (NASS) on Libri2Mix, built on SpeechBrain.

Install with GitHub

Once you have created your Python environment (Python 3.7), you can simply run:

git clone https://github.com/TzuchengChang/NASS
cd NASS/speechbrain
pip install -r requirements.txt
pip install --editable .
pip install mir-eval
pip install pyloudnorm

Introduction

Fig1. The overall pipeline of NASS. $x_n$ and $\hat n$ denote the noisy input and the predicted noise. $\hat{s_1}$ and $\hat{s_2}$ are the separated speech, while $s_1$ and $s_2$ are the ground truth. $h_{\hat{s_1}}$, $h_{\hat{s_2}}$ and $h_{\hat n}$ in the dashed box are predicted representations, while $h_{s_1}$ and $h_{s_2}$ in the solid box are the ground truth. "P" denotes that the mutual information between separated and ground-truth speech is maximized, while "N" denotes that the mutual information between separated speech and noise is minimized.

Fig2. The illustration of patch-wise contrastive learning. For the $i$-th of $K$ samplings, one query example $r^i_q$, one positive example $r^i_p$ and $M$ negative examples $r_n^{i,j}$ ($j \in [1,M]$) are drawn from the predicted speech representation $h_{\hat s_a}$, the ground-truth speech representation $h_{s_a}$ and the predicted noise representation $h_{\hat n}$, respectively. "CS" denotes cosine similarity.

Fig3. Spectrum results on Libri2Mix with Sepformer. Subplot (a) is the mixture; (b) and (c) are baseline results; (d), (e) and (f) are NASS results. Note that (d) is the noise output.

In this paper, we propose a noise-aware SS (NASS) method, which aims to improve the speech quality of separated signals under noisy conditions. Specifically, NASS views background noise as an additional output and predicts it along with the other speakers in a mask-based manner. To denoise effectively, we introduce patch-wise contrastive learning (PCL) between noise and speaker representations from the decoder input and the encoder output. The PCL loss minimizes the mutual information between the predicted noise and the other speakers at the multiple-patch level, suppressing noise information in the separated signals. Experimental results show that NASS achieves 1 to 2 dB of SI-SNRi or SDRi over DPRNN and Sepformer on the WHAM! and LibriMix noisy datasets, with a parameter increase of less than 0.1M.
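To make the PCL idea above concrete, here is a minimal NumPy sketch (not the repository's implementation) of an InfoNCE-style patch-wise contrastive loss: for each of $K$ samplings, a query patch from the predicted speech, a positive patch from the ground truth at the same position, and $M$ negative patches from the predicted noise are compared by cosine similarity. The function name, patch granularity (one time frame per patch), and the values of K, M and the temperature tau are illustrative assumptions.

```python
import numpy as np

def pcl_loss(h_pred, h_true, h_noise, K=4, M=2, tau=0.1, seed=None):
    """InfoNCE-style patch-wise contrastive loss sketch.

    h_pred:  predicted speech representation, shape (T, D)
    h_true:  ground-truth speech representation, shape (T, D)
    h_noise: predicted noise representation, shape (T, D)
    """
    rng = np.random.default_rng(seed)

    def cos(a, b):  # cosine similarity ("CS" in Fig2)
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    T = h_pred.shape[0]
    losses = []
    for _ in range(K):                           # K samplings
        i = rng.integers(T)
        q = h_pred[i]                            # query r_q from predicted speech
        p = h_true[i]                            # positive r_p: same patch, ground truth
        negs = h_noise[rng.integers(T, size=M)]  # M negatives from predicted noise
        logits = np.array([cos(q, p)] + [cos(q, n) for n in negs]) / tau
        logits -= logits.max()                   # numerical stability
        losses.append(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))
    return float(np.mean(losses))
```

Minimizing this loss pulls each predicted-speech patch toward its ground-truth counterpart while pushing it away from the predicted-noise patches, which is the "maximize P, minimize N" behavior described in Fig1.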

NASS Example

We also provide a real example: a Ted Cruz speech mixed with WHAM! noise at -2 dB.

Results are from Sepformer (NASS) trained on Libri2Mix.

| Mixture | Speaker 1 | Speaker 2 | Noise |
| --- | --- | --- | --- |
| Download | Download | Download | Download |

Run NASS Method

Step 1: Prepare the datasets. Please refer to the LibriMix repository.

Step 2: Modify the configurations. Configuration files are stored in NASS/recipes/LibriMix/separation/hparams/

Step 3: Run the NASS method.

cd NASS/speechbrain/recipes/LibriMix/separation/
python train.py hparams/sepformer-libri2mix.yaml --data_folder /yourpath/Libri2Mix/

We also provide a YAML file for custom data; make sure your custom folder structure matches Libri2Mix.

python train.py hparams/sepformer-libri2mix-custom.yaml --data_folder /yourpath/custom/
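For reference, a custom folder would mirror the Libri2Mix layout. A sketch of a typical structure (folder names assumed from the LibriMix generation scripts; your sample rate and mode may differ) looks like:

```
custom/
└── wav8k/
    └── min/
        ├── train-360/
        │   ├── mix_both/   # speech + noise mixtures
        │   ├── s1/         # ground-truth speaker 1
        │   ├── s2/         # ground-truth speaker 2
        │   └── noise/      # noise references
        ├── dev/
        └── test/
```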

Pretrained Model

We provide a pretrained model in the GitHub releases.

To use it, download "results.zip" and unzip it into NASS/recipes/LibriMix/separation/

Then run the NASS method as above.

Cite Our Paper

Please cite our paper and star our repository.

@misc{zhang2024noiseaware,
      title={Noise-Aware Speech Separation with Contrastive Learning}, 
      author={Zizheng Zhang and Chen Chen and Hsin-Hung Chen and Xiang Liu and Yuchen Hu and Eng Siong Chng},
      year={2024},
      eprint={2305.10761},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}
