View FranxYao's full-sized avatar

Efficient Triton Kernels for LLM Training

Python 3,431 202 Updated Nov 17, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,609 205 Updated Nov 16, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework (70B PPO Full Tuning & Iterative DPO & LoRA & RingAttention)

Python 2,637 248 Updated Nov 17, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

1,118 25 Updated Jul 31, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,836 112 Updated Jul 29, 2024

[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs

Python 217 15 Updated Apr 22, 2024

A native PyTorch Library for large model training

Python 2,615 204 Updated Nov 16, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,892 465 Updated May 3, 2024

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?

Python 717 47 Updated Oct 24, 2024
Python 572 27 Updated Feb 15, 2024

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,750 98 Updated Jan 21, 2024

Multimodal language model benchmark, featuring challenging examples

Python 149 6 Updated Aug 13, 2024

Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"

Python 159 13 Updated Aug 2, 2024
Python 1,939 165 Updated Oct 31, 2024
Python 187 7 Updated May 1, 2024

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100 datasets.

Python 4,124 437 Updated Nov 15, 2024

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 471 230 Updated Jul 4, 2024

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory

Python 18,212 1,269 Updated Nov 17, 2024

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Python 645 46 Updated Sep 27, 2024

This repo demonstrates the concept of lossless compression with Transformers as encoder and decoder.

Python 14 Updated May 2, 2024

Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"

Python 214 14 Updated Aug 16, 2024

[ACL 2024] Long-Context Language Modeling with Parallel Encodings

Python 144 9 Updated Jun 13, 2024

OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophisticated proprietary systems like the GPT-4 Code Interpreter. …

Python 1,599 204 Updated May 7, 2024
Python 11 Updated Oct 18, 2023

A library for efficient similarity search and clustering of dense vectors.

C 31,453 3,643 Updated Nov 15, 2024

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 5,666 514 Updated Oct 18, 2024

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,148 67 Updated Oct 14, 2024

Mamba SSM architecture

Python 13,211 1,127 Updated Nov 5, 2024

A framework for few-shot evaluation of language models.

Python 6,981 1,867 Updated Nov 16, 2024