
[ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman

Python 17 1 Updated Sep 27, 2024

Multimodal language model benchmark, featuring challenging examples

Python 145 6 Updated Aug 13, 2024
Python 280 7 Updated Jan 27, 2024

Structured Text Generation

Python 8,347 425 Updated Sep 28, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,688 451 Updated May 3, 2024

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

C 8,023 291 Updated Aug 31, 2024

LLM training code for Databricks foundation models

Python 3,981 525 Updated Sep 27, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 19,516 2,148 Updated Aug 12, 2024

A Data Streaming Library for Efficient Neural Network Training

Python 1,084 136 Updated Sep 26, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,047 165 Updated Aug 11, 2024

MeetEval - A meeting transcription evaluation toolkit

Python 75 14 Updated Sep 20, 2024

INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processin…

631 42 Updated Aug 9, 2024
Python 13 2 Updated Jun 14, 2024

Tools for handling speech data in machine learning projects.

Python 934 214 Updated Sep 17, 2024

Easily create large video dataset from video urls

Python 533 65 Updated Jul 30, 2024

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets

Python 10 1 Updated May 25, 2023

String-to-String Algorithms for Natural Language Processing

Jupyter Notebook 531 27 Updated Jul 26, 2024

ImageBind: One Embedding Space to Bind Them All

Python 8,243 758 Updated Jul 31, 2024

the subtitle editor :)

C# 8,388 891 Updated Sep 28, 2024

Simple Diarization model

Python 40 3 Updated Nov 29, 2023
Python 15 1 Updated Sep 25, 2023

Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva

Python 78 23 Updated Aug 14, 2024

Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens

Python 429 14 Updated Nov 6, 2023

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

C 69,608 7,616 Updated Sep 27, 2024

[CVPR'23 Highlight] AutoAD: Movie Description in Context.

Python 86 Updated Jul 23, 2024

A database of movie scripts from several sources

Python 150 24 Updated May 3, 2024

Inference code for Llama models

Python 55,698 9,497 Updated Aug 18, 2024

GPU tester that detects broken and slow GPUs in a cluster

Python 65 6 Updated Feb 19, 2023

Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch

Python 69 15 Updated Sep 27, 2021

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 9,705 951 Updated Aug 23, 2024