mystijk

8 followers · 16 following

Lists (32)

Stars

thunlp / LEGENT

Open Platform for Embodied Agents

Python 268 15 Updated Oct 13, 2024

cassiniR / FunASR

Forked from modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit

Python 1 Updated Oct 11, 2024

zakahan / MMeRAG

MMeRAG is an open-source RAG (Retrieval-Augmented Generation) project that provides a parser for audio and video data, enabling RAG over audio and video.

Python 5 Updated Sep 24, 2024

cpuimage / SimpleAudioDenoise

A Simple and Efficient Implementation Of Fast Fourier Transform For Audio Denoise

C 100 57 Updated Aug 11, 2020

OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 776 61 Updated Aug 27, 2024

SmartFlowAI / EmoLLM

Large model for mental health, LLM, The Big Model of Mental Health, Finetune, InternLM2, InternLM2.5, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral, LLama3, GLM4, Qwen2, LLama3.1

Python 839 121 Updated Oct 21, 2024

FireRedTeam / FireRedTTS

An Open-Sourced LLM-empowered Foundation TTS System

Python 429 29 Updated Oct 17, 2024

ggg0919 / cantor

HTML 67 7 Updated May 10, 2024

DAMO-NLP-SG / VideoLLaMA2

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 867 60 Updated Nov 4, 2024

okaris / omni-zero-couples

A diffusers pipeline for zero shot stylised couples portrait creation

Python 90 9 Updated Sep 25, 2024

lifeiteng / OmniSenseVoice

Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯

Python 725 29 Updated Nov 6, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

12,607 806 Updated Nov 10, 2024

neu-vi / OmniControl

OmniControl: Control Any Joint at Any Time for Human Motion Generation, ICLR 2024

Python 243 17 Updated Jun 14, 2024

ziyc / drivestudio

A 3DGS framework for omni urban scene reconstruction and simulation.

Python 574 46 Updated Sep 6, 2024

hhoangphuoc / SpeechLaughRecogniser

An ASR model for transcribing laughter and speech-laugh in conversational speech

Python 1 Updated Nov 12, 2024

Henry-23 / VideoChat

Real-time voice-interactive digital human, supporting an end-to-end voice solution (GLM-4-Voice - THG) and a cascaded solution (ASR-LLM-TTS-THG). Appearance and voice timbre are customizable, no training is required, voice cloning is supported, and first-packet latency is as low as 3s.

Python 308 36 Updated Nov 8, 2024

ZhengdiYu / SignAvatars

SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark

63 2 Updated Nov 5, 2024

YuliangXiu / PuzzleAvatar

[SIGGRAPH Asia 2024] PuzzleAvatar: Assembling 3D Avatars from Personal Albums

Python 243 10 Updated Nov 12, 2024

heawon-yoon / anim-gaussian

Animatable Gaussian textured Avatar

Python 46 2 Updated Jun 24, 2024

OpenTalker / ToonTalker

[ICCV 2023]ToonTalker: Cross-Domain Face Reenactment

Python 104 8 Updated Oct 29, 2024

IDEA-Research / HumanTOMATO

[ICML 2024] 🍅HumanTOMATO: Text-aligned Whole-body Motion Generation

Python 288 8 Updated Jun 19, 2024

CvHadesSun / FLame2SMPLX

A tool to transform FLAME texture space, shape, and pose parameters into the head (or face) of an SMPL or SMPLX model.

Python 35 2 Updated Mar 22, 2024

gpt-omni / mini-omni2

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,540 183 Updated Nov 6, 2024

Lightning-AI / litgpt

20 high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 10,668 1,062 Updated Nov 11, 2024

neeek2303 / EMOPortraits

Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Jupyter Notebook 303 17 Updated Oct 6, 2024

XichongLing / 3dgs-avatar

Python 1 Updated Sep 19, 2024

alibaba-yuanjing-aigclab / ViViD

ViViD: Video Virtual Try-on using Diffusion Models

Python 463 31 Updated Jun 21, 2024

Fictionarry / TalkingGaussian

[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting

Python 256 33 Updated Jul 30, 2024

hrithikkoduri / TalkQL

Talk to your database as if you were chatting with a friend. Turn natural language into powerful SQL queries effortlessly, and get your answers back in a language you understand. No technical jargo…

TypeScript 4 Updated Nov 12, 2024

rrangith / RealTalk

First Place Winner at Delta Hacks 5. Analyses speech, hand gestures, and facial expressions and gives both real-time feedback as well as a summary of results at the end.

Python 37 5 Updated Dec 10, 2022

Lists (32)

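Rendering that combines 3D models with textures

3D model conversion tools

3D representation -- a different approach (no smpl needed)

3D reconstruction

CV surveys

openMVG -- reconstruction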

smpl

some vton data prepare tool

3D reconstruction -- joint multi-photo reconstruction

Sheet music tools

Human-environment interaction, e.g. lighting where the person and the environment overlap.

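Human body segmentation

Portrait matting

Extracting body measurements from SMPL body shape

Animated image generation

Re-texture methods claimed to work well

Image/video super-resolution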

Image display feature extraction

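Pose reconstruction

Relighting the original image from new viewpoints

Engineering-grade code

Age transformation

Swapping other things

Datasets

Uncategorized for now, but seems well known

Natural dynamic clothing effects

Garment modeling and reconstruction

Point cloud generation

Automatic clothes changing <VTON-like>

Virtual try-on papers

Human body reconstruction

Music score recognition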
