Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 4,319 368 Updated Jul 25, 2024

RVC-Boss / GPT-SoVITS

As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).

Python 29,734 3,436 Updated Jul 25, 2024

hubertsiuzdak / snac

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Python 251 17 Updated Apr 9, 2024

Takaaki-Saeki / DiscreteSpeechMetrics

Reference-aware automatic speech evaluation toolkit

Python 80 5 Updated Feb 22, 2024

Stability-AI / stable-audio-tools

Generative models for conditional audio generation

Python 2,351 210 Updated Jul 15, 2024

Stability-AI / stable-audio-metrics

Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.

Python 117 13 Updated Jul 25, 2024

rsxdalv / tts-generation-webui

TTS Generation Web UI (Bark, MusicGen AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)

TypeScript 1,513 160 Updated Jul 25, 2024

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 23,640 3,385 Updated Jul 25, 2024

Plachtaa / VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

Python 7,456 739 Updated Feb 11, 2024

facebookresearch / SONAR

SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.

Python 292 32 Updated Jul 25, 2024

haoheliu / AudioLDM2

Text-to-Audio/Music Generation

Python 2,163 174 Updated Jun 27, 2024

lucidrains / voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

Python 561 45 Updated Feb 16, 2024

Anjok07 / ultimatevocalremovergui

GUI for a Vocal Remover that uses Deep Neural Networks.

Python 16,783 1,257 Updated May 23, 2024

w-okada / voice-changer

Realtime Voice Changer

Python 15,494 1,674 Updated Jul 24, 2024

rhasspy / gruut

A tokenizer, text cleaner, and phonemizer for many human languages.

Python 263 35 Updated Jul 3, 2024

hugofloresgarcia / vampnet

music generation with masked transformers!

Jupyter Notebook 274 35 Updated Jul 20, 2024

bshall / soft-vc

Soft speech units for voice conversion

Jupyter Notebook 391 32 Updated Mar 14, 2024

YuanGongND / whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

Python 303 24 Updated Feb 21, 2024

gpt-engineer-org / gpt-engineer

Specify what you want it to build, the AI asks for clarification, and then builds it.

Python 51,482 6,695 Updated Jul 25, 2024

vocodedev / vocode-core

🤖 Build voice-based LLM agents. Modular open source.

Python 2,558 430 Updated Jul 25, 2024

descriptinc / descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,044 91 Updated Jul 11, 2024

Kim Nguyen huukim136

Stars

NVIDIA / audio-flamingo

lucidrains / e2-tts-pytorch

yl4579 / StyleTTS2

Camb-ai / MARS5-TTS

glory20h / VoiceLDM

luosiallen / latent-consistency-model

segmind / distill-sd

descriptinc / audiotools

resemble-ai / resemble-enhance

open-mmlab / Amphion