Stars
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
[NeurIPS 2024🔥] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high …
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Repository of notes, code and notebooks in Python for the book Pattern Recognition and Machine Learning by Christopher Bishop
[ICLR2024] Official repo for paper "PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code"
Official implementation of Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
This repository contains the official implementation of "FlowIE: Efficient Image Enhancement via Rectified Flow"
Official repo of our paper "SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions"
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
[ECCV 2024] Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
PyTorch implementation of "Deep Equilibrium Diffusion Restoration with Parallel Sampling (CVPR 2024)"
[ICLR 2024] Controlling Vision-Language Models for Universal Image Restoration. 5th place in the NTIRE 2024 Restore Any Image Model in the Wild Challenge.
👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including LPIPS, FID, NIQE, NRQM(Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...
Generative Models by Stability AI
[CVPR 2024] SinSR: Diffusion-Based Image Super-Resolution in a Single Step
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
[CVPR 2024] Official code release of our paper "Diff-Plugin: Revitalizing Details for Diffusion-based Low-level tasks"
Official implementation of "Make It Count: Text-to-Image Generation with an Accurate Number of Objects"
[ECCV2024] Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization
[CVPR2024] SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution