Stars
Estimate absolute 3D human poses from RGB images.
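This project generates images from text. Based on the open-source model Stable Diffusion V1.5, it produces models that can run on a phone's CPU and NPU, along with the accompanying model runtime framework.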
Official Implementation of the ICCV 2023 paper: Perpetual Humanoid Control for Real-time Simulated Avatars
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Enjoy the magic of Diffusion models!
Node graph framework that can be re-implemented into applications that support PySide2
[Arxiv-2024] MotionLLM: Understanding Human Behaviors from Human Motions and Videos
MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
Official code for "Towards An End-to-End Framework for Flow-Guided Video Inpainting" (CVPR2022)
Inpaint anything using Segment Anything and inpainting models.
Next generation face swapper and enhancer
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
One minute of voice data is enough to train a good TTS model (few-shot voice cloning)
HumanML3D: A large and diverse 3D human motion-language dataset.
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4
ChatGLM3 series: Open Bilingual Chat LLMs
A series of large language models developed by Baichuan Intelligent Technology
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
LAVIS - A One-stop Library for Language-Vision Intelligence
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs
Monocular, One-stage, Regression of Multiple 3D People and their 3D positions & trajectories in camera & global coordinates. ROMP[ICCV21], BEV[CVPR22], TRACE[CVPR2023]
Realtime Voice Changer