
Dynamic resource changes for multi-dimensional parallelism training

Go 9 1 Updated Nov 11, 2024

The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".

Python 135 14 Updated Nov 17, 2024

Official inference framework for 1-bit LLMs

C 11,151 757 Updated Nov 11, 2024

Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling

Python 8 2 Updated Mar 7, 2024
Python 39 3 Updated Sep 26, 2024
Python 29 10 Updated Jul 4, 2024

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 32,862 3,951 Updated Nov 16, 2024

A collection of (mostly) technical things every software developer should know about

83,480 7,788 Updated Aug 6, 2024

Short code snippets for all your development needs

JavaScript 121,804 12,035 Updated Nov 16, 2024

Overall architectures of tech stack.

3 Updated Jun 3, 2023

"JABAS: Joint Adaptive Batching and Automatic Scaling for DNN Training on Heterogeneous GPUs" (EuroSys '25)

Python 12 Updated Sep 23, 2024

LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale

Python 58 8 Updated Oct 24, 2024

💯 Curated coding interview preparation materials for busy software engineers

TypeScript 119,069 14,731 Updated Oct 8, 2024

nnScaler: Compiling DNN models for Parallel Training

Python 71 12 Updated Oct 25, 2024

Since the emergence of ChatGPT in 2022, accelerating Large Language Models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on infer…

175 7 Updated Nov 5, 2024

[USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Parallelism

Python 45 1 Updated Jul 31, 2024

(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.

Python 34 6 Updated Nov 4, 2022

Training and serving large-scale neural networks with auto parallelization.

Python 3,077 358 Updated Dec 9, 2023

Artifact for DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines

Python 1 3 Updated Nov 9, 2023

A collection of design patterns/idioms in Python

Python 40,498 6,946 Updated Sep 5, 2024

paper and its code for AI System

211 13 Updated Aug 29, 2024

Zero Bubble Pipeline Parallelism

Python 280 14 Updated Nov 14, 2024

Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines

Python 14 2 Updated Dec 8, 2023

[ASPLOS'23] Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression

Python 3 4 Updated Oct 19, 2022

Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you are interested, please visit/star/fork https://github.com/P…

Python 14 7 Updated Jul 11, 2024
Python 5 Updated Oct 21, 2024

InternEvo is an open-source lightweight training framework that aims to support model pre-training without extensive dependencies.

Python 309 52 Updated Nov 7, 2024
Jupyter Notebook 129 7 Updated Mar 12, 2024