jisang93
  • Netmarble AI Center
  • Seoul, Republic of Korea

Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

Python 42 4 Updated Nov 7, 2024

Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)

Python 40 3 Updated Nov 1, 2024

SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

Python 76 4 Updated Nov 7, 2024
Python 48 2 Updated Oct 28, 2024

Official PyTorch implementation of the paper "AdaStride: Using Adaptive Strides in Sequential Data for Effective Downsampling"

Python 8 Updated Mar 29, 2024
Python 124 8 Updated Sep 5, 2024

Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)

Python 38 1 Updated Nov 1, 2024

A curated reading list of research in Mixture-of-Experts (MoE).

533 41 Updated Oct 30, 2024

Demo for DART, Audio Imagination workshop submission in NeurIPS 2024

HTML 7 1 Updated Oct 17, 2024

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 663 54 Updated Nov 7, 2024

UTokyo-SaruLab MOS Prediction System

Python 78 7 Updated Sep 27, 2024

PyTorch implementation of MAR DiffLoss https://arxiv.org/abs/2406.11838

Python 981 54 Updated Sep 27, 2024

Real-time Speech-Text Foundation Model Toolkit (wip)

Python 119 11 Updated Oct 14, 2024

Official JAX Implementation of MaskGIT

Jupyter Notebook 440 50 Updated Nov 18, 2022

PromptTTS: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions

Python 55 4 Updated Oct 11, 2024

SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)

Python 37 3 Updated Oct 18, 2024

[NeurIPS 2024] The official code of "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers"

Python 111 6 Updated Sep 30, 2024

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 6,710 781 Updated Nov 7, 2024

PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control

34 Updated Oct 16, 2024

[ICCV 2023] Online Clustered Codebook

Python 145 11 Updated Sep 19, 2024

Implementation of Liquid Nets in PyTorch

Python 51 7 Updated Nov 4, 2024

[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers

Jupyter Notebook 90 4 Updated Oct 21, 2024

This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen, Run Chen, and Julia Hirschberg.

Python 36 3 Updated Oct 3, 2024

Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks

Python 219 9 Updated Oct 29, 2024

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python 495 28 Updated Sep 27, 2024
Python 57 7 Updated Sep 3, 2024

PyTorch implementation of WaveFit [2022, Google], one of the SOTA lightweight/fast speech vocoders.

Python 45 3 Updated Oct 12, 2024

Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯

Python 679 24 Updated Nov 6, 2024

Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

Python 61 3 Updated Sep 20, 2024

An Open-Sourced LLM-empowered Foundation TTS System

Python 416 29 Updated Oct 17, 2024