jisang93

AI Research Engineer

19 followers · 21 following

Netmarble AI Center
Seoul, Republic of Korea

Lists (8)

Audio Generation

114 repositories

Awesome Information

22 repositories

Computer Vision

53 repositories

Dataset

Dataset for deep-learning

16 repositories

Deep Learning

264 repositories

Generative Models

256 repositories

Natural Language Process

43 repositories

Speech Synthesis

156 repositories

Stars

cantabile-kwok / vec2wav2.0

Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

Python · 42 stars · 4 forks · Updated Nov 7, 2024

walker-hyf / GPT-Talker

Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)

Python · 40 stars · 3 forks · Updated Nov 1, 2024

youngsheen / SimVQ

SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

Python · 76 stars · 4 forks · Updated Nov 7, 2024

justinlovelace / SESD

Python · 48 stars · 2 forks · Updated Oct 28, 2024

YoonhyungLee94 / TadaStride

Official PyTorch implementation of the paper "AdaStride: Using Adaptive Strides in Sequential Data for Effective Downsampling"

Python · 8 stars · Updated Mar 29, 2024

X-niper / UniTalker

Python · 124 stars · 8 forks · Updated Sep 5, 2024

walker-hyf / NCSSD

Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)

Python · 38 stars · 1 fork · Updated Nov 1, 2024

codecaution / Awesome-Mixture-of-Experts-Papers

A curated reading list of research in Mixture-of-Experts (MoE).

533 stars · 41 forks · Updated Oct 30, 2024

AMAAI-Lab / DART

Demo for DART, Audio Imagination workshop submission in NeurIPS 2024

HTML · 7 stars · 1 fork · Updated Oct 17, 2024

xdit-project / xDiT

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python · 663 stars · 54 forks · Updated Nov 7, 2024

sarulab-speech / UTMOSv2

UTokyo-SaruLab MOS Prediction System

Python · 78 stars · 7 forks · Updated Sep 27, 2024

LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python · 981 stars · 54 forks · Updated Sep 27, 2024

yangdongchao / RSTnet

Real-time Speech-Text Foundation Model Toolkit (wip)

Python · 119 stars · 11 forks · Updated Oct 14, 2024

google-research / maskgit

Official Jax Implementation of MaskGIT

Jupyter Notebook · 440 stars · 50 forks · Updated Nov 18, 2022

line / promptttspp

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions

Python · 55 stars · 4 forks · Updated Oct 11, 2024

alessandroragano / scoreq

SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)

Python · 37 stars · 3 forks · Updated Oct 18, 2024

YuchuanTian / U-DiT

[NeurIPS 2024] The official code of "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers"

Python · 111 stars · 6 forks · Updated Sep 30, 2024

SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python · 6,710 stars · 781 forks · Updated Nov 7, 2024

AaronZ345 / TCSinger

PyTorch implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control

34 stars · Updated Oct 16, 2024

lyndonzheng / CVQ-VAE

[ICCV 2023] Online Clustered Codebook

Python · 145 stars · 11 forks · Updated Sep 19, 2024

kyegomez / LiquidNet

Implementation of Liquid Nets in PyTorch

Python · 51 stars · 7 forks · Updated Nov 4, 2024

yzGuu830 / efficient-speech-codec

[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers

Jupyter Notebook · 90 stars · 4 forks · Updated Oct 21, 2024

tonychenxyz / emoknob

Code and data for the paper "EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control" by Haozhe Chen, Run Chen, and Julia Hirschberg.

Python · 36 stars · 3 forks · Updated Oct 3, 2024

GTSinger / GTSinger

Dataset and code for GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks

Python · 219 stars · 9 forks · Updated Oct 29, 2024

tianweiy / DMD2

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python · 495 stars · 28 forks · Updated Sep 27, 2024

bfs18 / e2_tts

Python · 57 stars · 7 forks · Updated Sep 3, 2024

yukara-ikemiya / wavefit-pytorch

PyTorch implementation of WaveFit [2022, Google], one of the SOTA lightweight, fast speech vocoders.

Python · 45 stars · 3 forks · Updated Oct 12, 2024

lifeiteng / OmniSenseVoice

Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯

Python · 679 stars · 24 forks · Updated Nov 6, 2024

hhguo / SoCodec

Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

Python · 61 stars · 3 forks · Updated Sep 20, 2024

FireRedTeam / FireRedTTS

An Open-Sourced LLM-empowered Foundation TTS System

Python · 416 stars · 29 forks · Updated Oct 17, 2024