Starred repositories
More suitable IP-Adapter for the DiT architecture
LayerDiffuse in pure diffusers without any GUI
Benchmarking Generalized Out-of-Distribution Detection
[MICCAI 2024] Codebase for "Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process"
Multi-Class Few-Shot Semantic Segmentation with Visual Prompts
Official PyTorch Implementation of DIaM in "A Strong Baseline for Generalized Few-Shot Semantic Segmentation" (CVPR 2023)
[ICCV 2021 Oral] Mining Latent Classes for Few-shot Segmentation
Official Implementation of VAT
High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study
This repo contains the documentation and code needed to use the PACO dataset: data loaders, training and evaluation scripts for object, part, and attribute prediction models, query evaluation scripts…
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model (SIGGRAPH 2024)
Implementation of "Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge", accepted by ICML 2024.
GIM: Learning Generalizable Image Matcher From Internet Videos (ICLR 2024 Spotlight)
Use PEFT or full-parameter training to fine-tune 350 LLMs or 90 MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vi…
[ECCV 2024] Towards Reliable Advertising Image Generation Using Human Feedback
[CVPR 2024] Official code for "Text-Driven Image Editing via Learnable Regions"
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
📷 [ECCV 2024] RAW-Adapter: Adapting Pre-trained Visual Model to Camera RAW Images