The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen)
- https://www.zhangxueyao.com/
Stars
Code for the ICML 2020 paper "CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information"
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Inference and training library for high-quality TTS models.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
The official GitHub page for the survey paper "Foundation Models for Music: A Survey".
A library for time-domain speech data augmentation
Diffusion Model for Voice Conversion
PolySinger: Singing-Voice to Singing-Voice Translation From English to Japanese
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI foley artist that adds vivid, synchronized sound effects to your silent videos 😝
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
This is the GitHub page for publicly available emotional speech data.
Public Code for Neural Codec Language Models for Disentangled and Textless Voice Conversion (Interspeech 2024)
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
A generative speech model for daily dialogue.
Pitch Estimating Neural Networks (PENN)
Code for SpeechTokenizer, presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on the accompanying demo page.
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
Lumina-T2X is a unified framework for Text to Any Modality Generation
Paper list of misinformation research using (multi-modal) large language models, i.e., (M)LLMs.
An extremely fast Python linter and code formatter, written in Rust.
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"