seungheondoh
🌊
Ph.D Journey

Highlights

  • Pro

Organizations

@KAIST-MACLab @mulab-mir


Towards Robust Transcription: Exploring Noise Injection Strategies for Training Data Augmentation

Jupyter Notebook 4 Updated Oct 22, 2024

CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models

Python 40 1 Updated Oct 21, 2024

Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986

Python 34 4 Updated Oct 13, 2024

Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval (TTMR) [ICASSP24]

Python 28 1 Updated Oct 7, 2024

A piano music dataset with Audio, Symbolic and Text labels

8 Updated Sep 26, 2024

Awesome Papers Using Vector Quantization for Recommender Systems (VQ4Rec)

15 1 Updated May 6, 2024

PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing

Python 33 2 Updated Oct 30, 2024
Python 6,581 500 Updated Oct 31, 2024

PyTorch implementation of MAR DiffLoss https://arxiv.org/abs/2406.11838

Python 959 50 Updated Sep 27, 2024

MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.

Jupyter Notebook 22 1 Updated Aug 9, 2024

Faster Whisper transcription with CTranslate2

Python 12,173 1,020 Updated Oct 30, 2024

Awesome Music Projects

1,864 109 Updated Oct 6, 2024

LLM101n: Let's build a Storyteller

29,568 1,616 Updated Aug 1, 2024

Official Repository of Unsupervised Lead Sheet Generation via Semantic Compression

Python 15 1 Updated Oct 23, 2023

⚡ InstaFlow! One-Step Stable Diffusion with Rectified Flow (ICLR 2024)

Python 1,176 37 Updated Jun 7, 2024

Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…

Jupyter Notebook 14,524 2,120 Updated Oct 31, 2024

Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior.

Jupyter Notebook 38 2 Updated Feb 9, 2024
TeX 37 Updated Jun 16, 2024
Python 137 22 Updated Oct 13, 2024

PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation"

Jupyter Notebook 30 1 Updated Jan 6, 2024

MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage

Python 36 2 Updated Jul 12, 2024

Musical Word Embedding for Music Tagging and Retrieval [IEEE TASLP]

Jupyter Notebook 21 Updated Apr 23, 2024

[AAAI'24] Official dataset & demo code for MID-FiLD: MIDI Dataset for Fine-Level Dynamics

12 Updated Mar 31, 2024

A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Python 586 206 Updated Aug 30, 2021

This is a cog implementation of the fine-tuner for Meta's MusicGen

Python 47 10 Updated Apr 5, 2024

Machine Learning Engineering Open Book

Python 11,542 702 Updated Oct 30, 2024

Paper list about hyperbolic embedding, hyperbolic models, and hyperbolic applications

358 30 Updated Sep 9, 2024

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

Python 279 33 Updated Apr 8, 2024