View ZhendongWang6's full-sized avatar
🎯
Focusing


Python 555 32 Updated Oct 31, 2024

Disclosure of evidence on the malicious cluster attack incident involving 田柯宇 (Tian Keyu)

572 39 Updated Oct 20, 2024

[NeurIPS 2024] Visual Perception by Large Language Model’s Weights

Python 26 1 Updated Oct 17, 2024

🔥ImageFolder: Autoregressive Image Generation with Folded Tokens

51 Updated Oct 15, 2024

CAR: Controllable AutoRegressive Modeling for Visual Generation

46 Updated Oct 8, 2024
Python 1,571 114 Updated Sep 23, 2024

Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine learned image captioning systems.

Shell 517 26 Updated Aug 21, 2021

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 12,054 1,087 Updated Oct 14, 2024

code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"

Python 567 28 Updated Oct 14, 2024

Kolors Team

Python 3,793 260 Updated Sep 4, 2024

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,147 190 Updated Oct 9, 2024

[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Python 264 9 Updated Aug 12, 2024

Official implementation of "EG4D: Explicit Generation of 4D Object without Score Distillation"

17 2 Updated May 29, 2024

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 2,064 87 Updated Aug 6, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,545 148 Updated Sep 25, 2024

[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"

Python 501 27 Updated Jul 16, 2024

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Jupyter Notebook 1,685 97 Updated Oct 10, 2024

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 257 23 Updated Dec 28, 2023

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 4,198 310 Updated Oct 6, 2024

Latte: Latent Diffusion Transformer for Video Generation.

Python 1,688 176 Updated Sep 28, 2024

VideoSys: An easy and efficient system for video generation

Python 1,747 116 Updated Oct 31, 2024

[NeurIPS 2024] GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling

Python 342 17 Updated Oct 25, 2024
Python 446 12 Updated Sep 16, 2024

One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more

Python 1,593 180 Updated Sep 8, 2024

Grok open release

Python 49,518 8,317 Updated Aug 30, 2024

Official Implementation of Rectified Flow (ICLR2023 Spotlight)

Python 906 53 Updated Jul 20, 2024

a state-of-the-art-level open visual language model | multimodal pre-trained model

Python 6,044 413 Updated May 29, 2024

[WIP] Layer Diffusion for WebUI (via Forge)

Python 3,858 331 Updated Aug 30, 2024

A collection of resources on controllable generation with text-to-image diffusion models.

896 26 Updated Oct 7, 2024

[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model

Python 373 9 Updated Oct 31, 2024