View m-bain's full-sized avatar


[ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman

Python 17 1 Updated Sep 27, 2024

Multimodal language model benchmark, featuring challenging examples

Python 148 6 Updated Aug 13, 2024
Python 282 7 Updated Jan 27, 2024

Structured Text Generation

Python 8,838 441 Updated Oct 18, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,772 455 Updated May 3, 2024

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

C 8,109 294 Updated Aug 31, 2024

LLM training code for Databricks foundation models

Python 4,020 524 Updated Oct 21, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 19,809 2,176 Updated Aug 12, 2024

A Data Streaming Library for Efficient Neural Network Training

Python 1,107 138 Updated Oct 18, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,102 172 Updated Aug 11, 2024

MeetEval - A meeting transcription evaluation toolkit

Python 75 14 Updated Oct 18, 2024

INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processin…

635 42 Updated Aug 9, 2024
Python 13 2 Updated Jun 14, 2024

Tools for handling speech data in machine learning projects.

Python 941 215 Updated Oct 7, 2024

Easily create large video dataset from video urls

Python 541 65 Updated Jul 30, 2024

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets

Python 10 1 Updated May 25, 2023

String-to-String Algorithms for Natural Language Processing

Jupyter Notebook 533 27 Updated Jul 26, 2024

ImageBind: One Embedding Space to Bind Them All

Python 8,299 761 Updated Jul 31, 2024

the subtitle editor :)

C# 8,552 894 Updated Oct 20, 2024

Simple Diarization model

Python 41 3 Updated Nov 29, 2023
Python 15 1 Updated Sep 25, 2023

Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva

Python 80 23 Updated Aug 14, 2024

Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens

Python 435 14 Updated Nov 6, 2023

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

C 70,128 7,663 Updated Oct 20, 2024

[CVPR'23 Highlight] AutoAD: Movie Description in Context.

Python 86 1 Updated Jul 23, 2024

A database of movie scripts from several sources

Python 152 26 Updated May 3, 2024

Inference code for Llama models

Python 56,082 9,530 Updated Aug 18, 2024

GPU tester: detects broken and slow GPUs in a cluster

Python 67 6 Updated Feb 19, 2023

Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch

Python 69 15 Updated Sep 27, 2021

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 9,797 962 Updated Oct 11, 2024