Starred repositories
Official research projects of badminton CoachAI
In this group project carried out with @Anannyap7, the aim is to take a professional badminton match video as an input and predict the most probable space on the court where the shot will be hit by…
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
AAAI-24 Decoupled Contrastive Learning for Long-Tailed Recognition
Home-buying knowledge drawn from my experience buying a house in Hangzhou in 2017, shared in the hope that it helps others. Buying a home is not easy, so…
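Research notes drawn from my experience buying a home in Shanghai in November 2020, shared as one tech worker helping another, in the hope that they are useful.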
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.
Official PyTorch implementation of Fully Attentional Networks
A high-throughput and memory-efficient inference and serving engine for LLMs
A curated list of foundation models for vision and language tasks
ReFT: Representation Finetuning for Language Models
Mixture-of-Experts for Large Vision-Language Models
Command-line program to download videos from YouTube.com and other video sites
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A deep learning library for video understanding research.
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Codebase for Merging Language Models (ICML 2024)
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Keyphrase or Keyword Extraction: a Chinese keyword extraction method based on pre-trained models (Chinese-language version of the code for the paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model")
Chinese models for SpaCy | Models for SpaCy that support Chinese
A Chinese NLP Preprocessing & Parsing Package: accurate, efficient, and easy to use. www.jionlp.com