mustious
Showing results
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support

Python 7,781 942 Updated Oct 7, 2024

GPUGrants - a list of GPU grants that I can think of

5 Updated Aug 2, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 37,839 3,981 Updated Jul 28, 2024

Ongoing research training transformer models at scale

Python 10,186 2,290 Updated Oct 5, 2024

Development repository for the Triton language and compiler

C 12,950 1,574 Updated Oct 7, 2024

Solve puzzles. Learn CUDA.

Jupyter Notebook 9,425 850 Updated Sep 1, 2024

GRadient-INformed MoE

250 15 Updated Sep 25, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 36,575 5,758 Updated Aug 19, 2024
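The GPT training repos above all build on the same core operation: causal scaled dot-product self-attention. A minimal single-head sketch in plain numpy (the weight matrices and shapes here are illustrative, not taken from any of the listed repos):

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention over a (T, d) sequence."""
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)                     # (T, T) attention logits
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)  # strictly-upper triangle
    scores[mask] = -np.inf                            # causal: no attending to future tokens
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                # softmax over keys
    return w @ v                                      # (T, d) mixture of values

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.normal(size=(T, d))
W = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]
out = causal_self_attention(x, *W)
print(out.shape)  # (4, 8)
```

Because of the causal mask, the first token can only attend to itself, so its output is exactly its own value vector.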

Building blocks for foundation models.

369 13 Updated Jan 3, 2024

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 405 30 Updated Sep 17, 2024

Efficient Triton Kernels for LLM Training

Python 3,140 166 Updated Oct 5, 2024

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

Python 861 46 Updated Jun 25, 2024

A simple, performant and scalable Jax LLM!

Python 1,486 279 Updated Oct 7, 2024

A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).

124 7 Updated Aug 9, 2024
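The common mechanism behind the MoE projects in this list (GRIN-MoE, OLMoE, LLaMA-MoE) is top-k expert routing: a linear gate scores every expert per token, only the k highest-scoring experts are activated, and their softmaxed scores become mixture weights. A hedged numpy sketch of just the routing step (gate weights and sizes are made up for illustration):

```python
import numpy as np

def topk_route(x, Wg, k=2):
    """Return (expert indices, mixture weights) for top-k MoE routing."""
    logits = x @ Wg                                    # (tokens, n_experts) gate scores
    topk = np.argsort(logits, axis=-1)[:, -k:]         # indices of the k best experts
    picked = np.take_along_axis(logits, topk, axis=-1)
    w = np.exp(picked - picked.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                 # weights sum to 1 per token
    return topk, w

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                           # 5 tokens, model dim 16
Wg = rng.normal(size=(16, 8))                          # gate over 8 experts
experts, weights = topk_route(x, Wg, k=2)
print(experts.shape, weights.shape)  # (5, 2) (5, 2)
```

Each token's output is then the weight-averaged sum of its k selected experts' feed-forward outputs; everything else in the layer stays dense.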

Optax is a gradient processing and optimization library for JAX.

Python 1,651 182 Updated Oct 7, 2024

PyTorch extensions for high performance and large scale training.

Python 3,167 279 Updated Aug 30, 2024

100 exercises to learn JAX

Jupyter Notebook 564 45 Updated Jun 11, 2022

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 8,409 350 Updated Sep 21, 2024
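einops expresses reshapes and transposes as readable pattern strings, e.g. `rearrange(x, 'b h w c -> b (h w) c')` to flatten the spatial axes of an image batch. A plain-numpy sketch of that same transformation (written without the library so it runs anywhere):

```python
import numpy as np

def flatten_spatial(x):
    """Numpy equivalent of einops' rearrange(x, 'b h w c -> b (h w) c'):
    merge the height and width axes into one token axis."""
    b, h, w, c = x.shape
    return x.reshape(b, h * w, c)

x = np.zeros((2, 4, 4, 3))   # batch of 2 RGB 4x4 images
y = flatten_spatial(x)
print(y.shape)  # (2, 16, 3)
```

The appeal of einops is that the pattern string documents the axis semantics and fails loudly on shape mismatches, where a bare `reshape` would silently succeed.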

Fast and memory-efficient exact attention

Python 13,656 1,252 Updated Oct 6, 2024

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 7,726 1,057 Updated Sep 10, 2024

🚴 Call stack profiler for Python. Shows you why your code is slow!

Python 6,487 230 Updated Oct 7, 2024

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 30,272 6,385 Updated Oct 3, 2024

Agentic components of the Llama Stack APIs

Python 3,691 549 Updated Oct 4, 2024

Multi-Agent Reinforcement Learning with JAX

Python 409 71 Updated Oct 2, 2024