Stars
[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
A comprehensive list of resources (papers, repositories, etc.) about face restoration methods.
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
📖 A curated list of resources dedicated to talking face.
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Incredibly descriptive audiovisual summaries for videos
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Mixture-of-Experts for Large Vision-Language Models
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
We write your reusable computer vision tools. 💜
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution"
VRT: A Video Restoration Transformer (official repository)
Official codes of DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
[CVPR 2024] FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models
Generative Models by Stability AI
GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code