Cannot reproduce reported SDR & retrain the speaker embedding #30

Open
nnbtam opened this issue Jan 27, 2022 · 0 comments


nnbtam commented Jan 27, 2022

Hello, I have two questions about the implementation.

  1. I cannot reproduce the results reported in the README.
    I trained for more than 400k steps on the LibriSpeech 360h and 100h clean subsets, using the embedder provided in this repo.
    However, the best SDR I can obtain is 5.5.
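
For reference, here is a minimal sketch of the plain signal-to-distortion ratio, 10 * log10(||s||^2 / ||s - s_hat||^2), in dB. The function name `simple_sdr` and the `eps` term are illustrative assumptions. The evaluation code in this repo may use a different definition (for example a BSS-eval style SDR), so values from this sketch are not necessarily comparable to the number in the README.

```python
import math


def simple_sdr(reference, estimate, eps=1e-8):
    """Plain SDR in dB between two equal-length sequences of samples.

    Hypothetical helper for illustration only; not the repo's metric.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the clean target signal.
    signal_power = sum(r * r for r in reference)
    # Energy of the residual between target and estimate.
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps avoids division by zero and log of zero for degenerate inputs.
    return 10.0 * math.log10((signal_power + eps) / (error_power + eps))
```

Comparing the SDR of the raw mixture against the clean target with the SDR of the model output against the same target helps show how much of the 5.5 is actual improvement.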

To build the combined training set, I generated the mixed audio for the 360h and 100h subsets separately, then merged the two outputs into a single folder. Is this the right approach when I want to use more data to train the voice filter module?
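
A minimal sketch of such a merge, assuming each subset was generated into its own flat folder and that the two folders may contain files with identical names. The function name `merge_mixture_dirs` and the `set0_`, `set1_` prefixes are hypothetical. If the data loader in this repo matches files by a specific naming pattern, the prefixing scheme would need to be adapted to it.

```python
import shutil
from pathlib import Path


def merge_mixture_dirs(sources, dest):
    """Copy all files from each source folder into dest.

    Each file gets a per-source prefix (set0_, set1_, ...) so that files
    with the same name in different sources do not overwrite each other.
    Returns the number of files copied.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = 0
    for index, src in enumerate(map(Path, sources)):
        for path in sorted(src.iterdir()):
            if path.is_file():
                shutil.copy2(path, dest / f"set{index}_{path.name}")
                copied += 1
    return copied
```

Checking that the returned count equals the sum of the file counts of the source folders is a quick way to confirm that nothing was silently overwritten during the merge.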

  2. I got worse results after retraining the speaker embedder.
    I retrained the embedder with the following repo: Speaker verification. The training data came from three datasets: LibriSpeech, VoxCeleb1, and VoxCeleb2.

In theory, I expected the voice filter module to benefit from an embedder trained on more data, but the results got even worse. Could you share how you trained the embedder provided in this repo?

Thank you in advance!
