Stars
[arXiv 2024] Official implementation of the paper "Robots Pre-Train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets".
Stream VR games from your PC to your headset via Wi-Fi
A toolkit for making real-world machine learning and data analysis applications in C++
A collection of high-quality models for the MuJoCo physics engine, curated by Google DeepMind.
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control
This code corresponds to simulation environments used as part of the MimicGen project.
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
[arXiv 2024] Generalizable Humanoid Manipulation with Improved 3D Diffusion Policies. Part 1: training and deployment of iDP3
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solution team.
Official implementation of the paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
Solve puzzles. Improve your PyTorch.
Paper list accompanying the survey "Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis"
rewind.ai x cursor.com = your AI assistant that has all the context
Official implementation of LOTUS: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
A Vision-Language Model for Spatial Affordance Prediction in Robotics
[ECCV 2024] Improving 2D Feature Representations by 3D-Aware Fine-Tuning
Official implementation of the ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"
A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training
Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL