Zhendong Wang ZhendongWang6

🎯

Focusing

Ph.D. student focusing on computer vision and deep learning.

105 followers · 47 following

University of Science and Technology of China (USTC)
Hefei, China
https://zhendongwang6.github.io/
https://scholar.google.com.hk/citations?user=Ya5VDjQAAAAJ&hl=zh-CN

Stars

Stability-AI / sd3.5

Python · 555 stars · 32 forks · Updated Oct 31, 2024

JusticeFighterDance / JusticeFighter110

Disclosure of evidence on the 田柯宇 (Tian Keyu) malicious cluster attack incident

572 stars · 39 forks · Updated Oct 20, 2024

FeipengMa6 / VLoRA

[NeurIPS 2024] Visual Perception by Large Language Model’s Weights

Python · 26 stars · 1 fork · Updated Oct 17, 2024

lxa9867 / ImageFolder

🔥ImageFolder: Autoregressive Image Generation with Folded Tokens

51 stars · Updated Oct 15, 2024

MiracleDance / CAR

CAR: Controllable AutoRegressive Modeling for Visual Generation

46 stars · Updated Oct 8, 2024

XLabs-AI / x-flux

Python · 1,571 stars · 114 forks · Updated Sep 23, 2024

google-research-datasets / conceptual-captions

Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine learned image captioning systems.

Shell · 517 stars · 26 forks · Updated Aug 21, 2021

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook · 12,054 stars · 1,087 forks · Updated Oct 14, 2024

buoyancy99 / diffusion-forcing

code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"

Python · 567 stars · 28 forks · Updated Oct 14, 2024

Kwai-Kolors / Kolors

Kolors Team

Python · 3,793 stars · 260 forks · Updated Sep 4, 2024

Yutong-Zhou-cv / Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,147 stars · 190 forks · Updated Oct 9, 2024

pkunlp-icler / FastV

[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Python · 264 stars · 9 forks · Updated Aug 12, 2024

jasongzy / EG4D

Official implementation of "EG4D: Explicit Generation of 4D Object without Score Distillation"

17 stars · 2 forks · Updated May 29, 2024

Alpha-VLLM / Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python · 2,064 stars · 87 forks · Updated Aug 6, 2024

deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,545 stars · 148 forks · Updated Sep 25, 2024

frank-xwang / InstanceDiffusion

[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"

Python · 501 stars · 27 forks · Updated Jul 16, 2024

YangLing0818 / RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Jupyter Notebook · 1,685 stars · 97 forks · Updated Oct 10, 2024

SalesforceAIResearch / DiffusionDPO

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python · 257 stars · 23 forks · Updated Dec 28, 2023

FoundationVision / VAR

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python · 4,198 stars · 310 forks · Updated Oct 6, 2024