This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.

72 6 Updated Jun 7, 2024

Srijith-rkr / Whispering-LLaMA

EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction

Jupyter Notebook 209 15 Updated May 19, 2024

seonminkoo / KEBAP

2 Updated Nov 5, 2023

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 23,649 3,388 Updated Jul 26, 2024

meta-llama / llama

Inference code for Llama models

Python 54,575 9,359 Updated Jul 25, 2024

0nutation / SpeechGPT

SpeechGPT Series: Speech Large Language Models

Python 1,093 69 Updated Jul 22, 2024

Hypotheses-Paradise / Hypo2Trans

Single-blind supplementary materials for NeurIPS 2023 submission

Python 52 4 Updated Jun 6, 2024

YUCHEN005 / RobustGER

Code for paper "Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"

Python 113 2 Updated May 8, 2024

microsoft / Pengi

An Audio Language model for Audio Tasks

Python 269 15 Updated Apr 19, 2024

YuanGongND / ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Python 340 27 Updated Apr 24, 2024

preservim / nerdtree

A tree explorer plugin for vim.

Vim Script 19,432 1,435 Updated Jul 20, 2024

solee0022

Stars

YUCHEN005 / STAR-Adapt

lingjzhu / charsiu

lingjzhu / clap-ipa

pytorch / audio

AdolfVonKleist / Phonetisaurus

m-bain / whisperX

YuanGongND / whisper-at

microsoft / NeuralSpeech

huggingface / diarizers

asappresearch / wav2seq

csalt-research / accented-codebooks-asr

linto-ai / whisper-timestamped

lumaku / ctc-segmentation

unslothai / unsloth

tatsu-lab / stanford_alpaca

cwang621 / blsp

Lightning-AI / pytorch-lightning

meta-llama / llama3

ga642381 / speech-trident

WangHelin1997 / SpeechTasks