- Shanghai
- 18:24 (UTC +08:00)
- kobeshegu.github.io
- @kobeshegu
- https://www.zhihu.com/people/ke-ke-ke-ke-ke-da-xia
Stars
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Official implementation of Add-SD: Rational Generation without Manual Reference.
[ECCV2024] Towards Reliable Advertising Image Generation Using Human Feedback
Official inference repo for FLUX.1 models
[arXiv preprint] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
PyTorch implementation of MAR DiffLoss https://arxiv.org/abs/2406.11838
Utilities intended for use with Llama models.
A curated list of foundation models for vision and language tasks
Tracking and collecting papers/projects/others related to Segment Anything.
A PyTorch implementation of the paper "ZigMa: A DiT-Style Mamba-based Diffusion Model" (ECCV 2024)
[ECCV 2024] 3DPE: Real-time 3D-aware Portrait Editing from a Single Image
SEED-Story: Multimodal Long Story Generation with Large Language Model
Understand Human Behavior to Align True Needs
Vico: Compositional Video Generation as Flow Equalization
[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
[BSQ-ViT] Image and Video Tokenization with Binary Spherical Quantization
[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)
[ECCV2024] This is an official inference code of the paper "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
Official Implementation for "Consistency Flow Matching: Defining Straight Flows with Velocity Consistency"