Hong Kong University of Science and Technology
Hong Kong
https://xutong.tech
Stars
WireGuard client that exposes itself as a SOCKS5 proxy
Distribute and run LLMs with a single file.
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
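A tool like this typically starts from a back-of-the-envelope estimate: weight memory is parameter count times bytes per weight, plus headroom for activations and the KV cache. A minimal sketch of that arithmetic, assuming a hypothetical `estimate_vram_gb` helper and a flat 20% overhead factor (not the tool's actual API or model):

```python
def estimate_vram_gb(n_params_b: float, bits_per_weight: int,
                     overhead: float = 1.2) -> float:
    """Rough VRAM (GiB) to hold the weights of an n_params_b-billion-parameter
    model at the given quantization width, with a flat multiplier standing in
    for activations / KV cache. Illustrative only."""
    weight_bytes = n_params_b * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1024**3

# e.g. a 7B model in 4-bit quantization needs roughly 4 GiB
print(round(estimate_vram_gb(7, 4), 1))
```

Real estimators refine this per quantization scheme (llama.cpp/GGML block formats, bnb NF4, QLoRA adapters), since each carries different per-weight metadata overhead.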
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
Reaching LLaMA2 Performance with 0.1M Dollars
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
A quick guide (especially) for trending instruction finetuning datasets
vpnc-script replacement for easy and secure split-tunnel VPN setup
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer language models at scale, including: BERT & GPT-2
Extract WeChat chat history and export it to HTML, Word, or Excel documents for permanent storage; analyze the chat records to generate an annual chat report; and train a personal AI chat assistant on your own chat data
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
How to optimize algorithms in CUDA.
Machine Learning Engineering Open Book
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Parallel GDB developed for debugging HPC code at Lawrence Livermore National Laboratory.
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Bash script for Ubuntu (and derivatives) to easily (un)install kernels from the Ubuntu Kernel PPA
Universal LLM Deployment Engine with ML Compilation