Stars
Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models"
Keyword spotting and forced alignment in any language
Data manipulation and transformation for audio signal processing, powered by PyTorch
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Segment an audio file and obtain utterance alignments. (Python package)
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
Code and documentation to train Stanford's Alpaca models, and generate the data.
BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
Awesome speech/audio LLMs, representation learning, and codec models
This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.
EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction
A high-throughput and memory-efficient inference and serving engine for LLMs
SpeechGPT Series: Speech Large Language Models
Single-blind supplementary materials for NeurIPS 2023 submission
Code for paper "Large Language Models are Efficient Learners of Noise-Robust Speech Recognition"
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".