Lists (6)
awesome libraries: For efficiency and simplicity
chat-gpt usecases: Talented developers always surprise you with how they use the GPT API
Image / Video Gen: Image generation & editing
(M)LLM: (Multimodal) Large Language Models
open dataset: Open Dataset Collections
transformers in Everything: Such as CV tasks
Starred repositories
High-resolution models for human tasks.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
VizTracer is a low-overhead logging/debugging/profiling tool that can trace and visualize your python code execution.
Virtual whiteboard for sketching hand-drawn like diagrams
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.
Visualizing the attention of vision-language models
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Official inference repo for FLUX.1 models
[ECCV 2022] Code for paper "DaViT: Dual Attention Vision Transformer"
Real-time face swap and one-click video deepfake with only a single image
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
SGLang is a fast serving framework for large language models and vision language models.
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"
MINT-1T: A one trillion token multimodal interleaved dataset.
Official PyTorch implementation of Revisiting Image Pyramid Structure for High Resolution Salient Object Detection (ACCV 2022)
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Code for "Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text" (NeurIPS 2024).
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.