Hong Kong University of Science and Technology
Hong Kong
https://xutong.tech
Stars
WireGuard client that exposes itself as a SOCKS5 proxy
Distribute and run LLMs with a single file.
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
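A tool like this typically starts from a back-of-the-envelope estimate: weight memory is parameter count times bytes per weight, plus headroom for activations and the KV cache. A minimal sketch of that arithmetic, assuming a hypothetical `estimate_vram_gb` helper and a flat 20% overhead factor (not the tool's actual API or model):

```python
def estimate_vram_gb(n_params_b: float, bits_per_weight: int,
                     overhead: float = 1.2) -> float:
    """Rough VRAM (GiB) to hold the weights of an n_params_b-billion-parameter
    model at the given quantization width, with a flat multiplier standing in
    for activations / KV cache. Illustrative only."""
    weight_bytes = n_params_b * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1024**3

# e.g. a 7B model in 4-bit quantization needs roughly 4 GiB
print(round(estimate_vram_gb(7, 4), 1))
```

Real estimators refine this per quantization scheme (llama.cpp/GGML block formats, bnb NF4, QLoRA adapters), since each carries different per-weight metadata overhead.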
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
Reaching LLaMA2 Performance with 0.1M Dollars
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
A quick guide (especially) for trending instruction finetuning datasets
vpnc-script replacement for easy and secure split-tunnel VPN setup
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer language models at scale, including: BERT & GPT-2
Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat records to generate an annual chat report; and train a personal AI chat assistant on your own chat data
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
How to optimize algorithms in CUDA.
Machine Learning Engineering Open Book
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Parallel GDB developed for debugging HPC code at Lawrence Livermore National Laboratory.
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Bash script for Ubuntu (and derivatives) to easily (un)install kernels from the Ubuntu Kernel PPA
Universal LLM Deployment Engine with ML Compilation