Stars
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
3D Point Cloud Annotation Platform for Autonomous Driving
[NeurIPS 2023] Official code of "One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization"
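A repository for pretraining from scratch and SFT of a small-parameter Chinese LLaMa2; a single 24 GB GPU is enough to run it and obtain a chat-llama2 with basic Chinese question-answering ability.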
Use PEFT or Full-parameter to finetune 400 LLMs or 100 MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-V…
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
fast-stable-diffusion + DreamBooth
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
[NeurIPS 2024] Official implementation of the paper "Parameter Efficient Adaptation for Image Restoration with Heterogeneous Mixture-of-Experts"
[ECCV 2024] InstructIR: High-Quality Image Restoration Following Human Instructions https://huggingface.co/spaces/marcosv/InstructIR
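This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).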
OpenMMLab's next-generation platform for general 3D object detection.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline
PointNet and PointNet++ implemented in PyTorch (pure Python) and run on ModelNet, ShapeNet and S3DIS.
Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
CVPR2024 - Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary
ControlNet++: All-in-one ControlNet for image generation and editing!