Stars
Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.
✨✨ MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
The official repository of our survey paper: "Towards a Unified View of Preference Learning for Large Language Models: A Survey"
Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
aider is AI pair programming in your terminal
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…
📚 A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Utilities intended for use with Llama models.
Vector (and Scalar) Quantization, in Pytorch
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc,…
[ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
[MM 2024] Official code for VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
Official github repo for the paper "Compression Represents Intelligence Linearly"
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
Efficiently Fine-Tune 100 LLMs in WebUI (ACL 2024)
📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
Anole: An Open, Autoregressive, and Native Multimodal Model for Interleaved Image-Text Generation
Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied by carefully written, concise descriptions to help readers g…
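Several of the starred repos above (the VQ-VAE paper collection and "Vector (and Scalar) Quantization, in Pytorch") revolve around the same core operation: snapping each input vector to its nearest codebook entry. A minimal NumPy sketch of that nearest-codebook assignment is below; the function name and shapes are illustrative and this is not the API of any of the listed libraries:

```python
import numpy as np

def vector_quantize(x, codebook):
    """Quantize each row of x to its nearest codebook vector.

    x:        (N, D) array of input vectors
    codebook: (K, D) array of code vectors
    Returns the quantized vectors (N, D) and chosen indices (N,).
    """
    # Squared L2 distance between every input and every code: (N, K)
    dists = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)          # nearest code per input
    return codebook[indices], indices

# Toy usage: two codes, two inputs near each code.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
x = np.array([[0.1, 0.1], [0.9, 1.2]])
quantized, idx = vector_quantize(x, codebook)
```

In a trained VQ-VAE the codebook is learned (e.g. via EMA updates or a commitment loss) and the non-differentiable argmin is bypassed with a straight-through estimator; the repos above implement those pieces.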