- Johns Hopkins University
- Baltimore, Maryland, U.S.
- (UTC -04:00)
- https://williamium3000.github.io/
Highlights
- Pro
Lists (25)
2d generation
a list of 2d vision perception & generative model repos
3d generation & graphics
a list of 3d vision generation & computer graphics repos
3d perception
a list of 3d vision perception repos
ai4sci
a list of ai4sci repos
autonomous driving
dataset distill
emerging innovation
federated
general learning
general learning including 1. noisy label 2. continual learning
knowledge distill
medical
multi-modality
any repo that involves multiple modality
nlp & llm
paper list
quality-assessment
remote-sensing
robotics
scene understanding
a list of 2d scene understanding repos
self supervised
a list of self supervised learning repos
semi-supervised
synthetic
tools
transformer
trustworthy
a list of trustworthy (robustness & generalization & trustworthy) ML repos
vision basics
Starred repositories
A list of video object segmentation (VOS) papers
🔖 Curated list of video object segmentation (VOS) papers, datasets, and projects.
MINT-1T: A one trillion token multimodal interleaved dataset.
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
[ICCV 2023, Official Code] for paper "Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives". Official Weights and Demos provided.
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.
A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.
[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
Command-line program to download videos from YouTube.com and other video sites
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
Easily create large video dataset from video urls
A collection of awesome video generation studies.
A work list of recent human video generation methods. This repository focuses on half/full body human video generation methods. The NeRF, Gaussian splatting, Motion Pose, and talking head/Portrait is n…
A collection of resources on digital human including clothed people digitalization, virtual try-on, and other related directions.
A curated list of awesome resources for salient object detection (SOD), focusing more on multi-modal SOD, such as RGB-D SOD.
Paper, dataset and code list for multimodal dialogue.
[NeurIPS 2024 D&B Track] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
A comprehensive list of Implicit Representations and NeRF papers relating to Robotics/RL domain, including papers, codes, and related websites
A collection of high-quality models for the MuJoCo physics engine, curated by Google DeepMind.