Organizations

@baaivision

Stars

LLM

46 repositories

What would you do with 1000 H100s...

Jupyter Notebook 892 52 Updated Jan 10, 2024

Grok open release

Python 49,493 8,323 Updated Aug 30, 2024

A PyTorch Native LLM Training Framework

Python 639 33 Updated Aug 25, 2024

🙌 OpenHands: Code Less, Make More

Python 33,122 3,793 Updated Oct 21, 2024

[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?

Python 1,846 318 Updated Oct 17, 2024

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Python 6,900 1,003 Updated Oct 17, 2024

A framework for few-shot evaluation of language models.

Python 6,740 1,794 Updated Oct 20, 2024

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 7,045 445 Updated Oct 10, 2024

PyTorch native finetuning library

Python 4,184 404 Updated Oct 20, 2024

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Go 94,094 7,444 Updated Oct 21, 2024

CoreNet: A library for training deep neural networks

Jupyter Notebook 6,965 539 Updated Oct 14, 2024

Robust recipes to align language models with human and AI preferences

Python 4,602 402 Updated Oct 7, 2024

Modeling, training, eval, and inference code for OLMo

Python 4,538 456 Updated Oct 21, 2024

Evaluation suite for LLMs

Python 300 38 Updated Jun 13, 2024

[ICML 2024] Selecting High-Quality Data for Training Language Models

Python 136 10 Updated Jun 20, 2024

Code accompanying the paper "Massive Activations in Large Language Models"

Python 113 8 Updated Mar 4, 2024

Jupyter Notebook 10 Updated Apr 3, 2023

Minimalistic large language model 3D-parallelism training

Python 1,180 113 Updated Oct 9, 2024

LLM101n: Let's build a Storyteller

29,420 1,608 Updated Aug 1, 2024

Python 438 44 Updated Oct 7, 2024

Easily embed, cluster and semantically label text datasets

Python 452 36 Updated Mar 28, 2024

The official implementation of Self-Play Fine-Tuning (SPIN)

Python 1,014 90 Updated May 8, 2024

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 8,979 558 Updated Oct 15, 2024

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 7,852 1,083 Updated Oct 21, 2024

aider is AI pair programming in your terminal

Python 20,679 1,905 Updated Oct 16, 2024

LongWriter: Unleashing 10,000 Word Generation from Long Context LLMs

Python 1,427 118 Updated Sep 27, 2024

Scalable toolkit for efficient model alignment

Python 572 68 Updated Oct 20, 2024

Ongoing research training transformer models at scale

Python 10,339 2,315 Updated Oct 19, 2024