Stars
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
DynamicPose, a simple and robust framework for animating human images.
High-Resolution Image Synthesis with Latent Diffusion Models
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
llama3 implementation one matrix multiplication at a time
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
[CVPR'24] DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
GLM-4 series: Open Multilingual Multimodal Chat LMs
A generative speech model for daily dialogue.
A high-throughput and memory-efficient inference and serving engine for LLMs
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
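A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, covering base models, vertical-domain fine-tuning and applications, datasets, and tutorials.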
Faster Whisper transcription with CTranslate2
Robust Speech Recognition via Large-Scale Weak Supervision
OpenMMLab Model Compression Toolbox and Benchmark.
[ICLR 2020] Contrastive Representation Distillation (CRD), and benchmark of recent knowledge distillation methods
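"Open-Source LLM User Guide" (《开源大模型食用指南》): a tutorial for quickly deploying open-source large models in a Linux environment, tailored to beginners in China.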
A WebGL accelerated JavaScript library for training and deploying ML models.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guid…
Digital Human Resource Collection: 2D/3D/4D human modeling, avatar generation & animation, clothed people digitalization, virtual try-on, and others.
[CVPR 2024] CoSeR: Bridging Image and Language for Cognitive Super-Resolution
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
[CVPR2024] NeuRAD: Neural Rendering for Autonomous Driving