
Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman

Python 14 Updated Jul 23, 2024

Multimodal language model benchmark, featuring challenging examples

Python 140 6 Updated Aug 13, 2024
Python 264 7 Updated Jan 27, 2024

Structured Text Generation

Python 7,925 393 Updated Aug 16, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,507 443 Updated May 3, 2024

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

C 7,881 289 Updated Aug 7, 2024

LLM training code for Databricks foundation models

Python 3,920 515 Updated Aug 17, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 18,764 2,057 Updated Aug 12, 2024

A Data Streaming Library for Efficient Neural Network Training

Python 1,057 133 Updated Aug 15, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 1,953 156 Updated Aug 11, 2024

MeetEval - A meeting transcription evaluation toolkit

Python 72 14 Updated Aug 14, 2024

INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processin…

619 42 Updated Aug 9, 2024
Python 12 2 Updated Jun 14, 2024

Tools for handling speech data in machine learning projects.

Python 915 209 Updated Aug 14, 2024

Easily create large video dataset from video urls

Python 521 60 Updated Jul 30, 2024

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets

Python 10 1 Updated May 25, 2023

String-to-String Algorithms for Natural Language Processing

Jupyter Notebook 517 27 Updated Jul 26, 2024

ImageBind: One Embedding Space to Bind Them All

Python 8,157 745 Updated Jul 31, 2024

the subtitle editor :)

C# 8,053 876 Updated Aug 17, 2024

Simple Diarization model

Python 37 3 Updated Nov 29, 2023
Python 14 1 Updated Sep 25, 2023

Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva

Python 78 23 Updated Aug 14, 2024

Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens

Python 406 14 Updated Nov 6, 2023

GPT4All: Chat with Local LLMs on Any Device

C 68,583 7,518 Updated Aug 17, 2024

[CVPR'23 Highlight] AutoAD: Movie Description in Context.

Python 85 Updated Jul 23, 2024

A database of movie scripts from several sources

Python 147 24 Updated May 3, 2024

Inference code for Llama models

Python 55,109 9,410 Updated Aug 18, 2024

GPU tester: detects broken and slow GPUs in a cluster

Python 63 6 Updated Feb 19, 2023

Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch

Python 68 15 Updated Sep 27, 2021

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 9,380 926 Updated Aug 16, 2024