LLM
🚀🧠💬 Supercharged Custom Instructions for ChatGPT (non-coding) and ChatGPT Advanced Data Analysis (coding).
Sparsity-aware deep learning inference runtime for CPUs
A programming framework for agentic AI 🤖
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
An LLM playground you can run on your laptop
An innovative library for efficient LLM inference via low-bit quantization
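An AI model API management and distribution system that supports calling multiple large models through the OpenAI format, supports Midjourney Proxy, Suno, and Rerank, and is compatible with the 易支付 (Epay) payment protocol. Intended only for individual or internal enterprise management and channel distribution; not for commercial use. This project is a derivative of One API.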
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Efficient Triton Kernels for LLM Training
An acceleration library that supports arbitrary bit-width combinatorial quantization operations