Stars
Distributed Asynchronous Hyperparameter Optimization in Python
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
A low-latency & high-throughput serving engine for LLMs
A scalable sharding solution for blockchain-based federated learning. SCaFL or ScaleSFL?
A collection of blockchain technology development resources, including Fabric and Ethereum development materials
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
A list of existing blockchain-related academic papers, sorted by conference and publication year. Developers and researchers are welcome to add more published papers to this list.
SOFARPC is a high-performance, high-extensibility, production-level Java RPC framework.
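go-chat: a web chat application built in Go on top of WebSocket. One-to-one and group chat; text, image, voice, and video messages; screen sharing; clipboard images; WebRTC-based P2P voice calls and video chat.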
Apollo is a reliable configuration management system suitable for microservice configuration management scenarios.
7 days golang programs from scratch (web framework Gee, distributed cache GeeCache, object-relational mapping (ORM) framework GeeORM, RPC framework GeeRPC, etc.). A series on building things from scratch in Go in 7 days.
ByConity is an open source cloud data warehouse
Transformer-related optimization, including BERT and GPT
Curated list of project-based tutorials
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12 clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
SpotServe: Serving Generative Large Language Models on Preemptible Instances
Basic Sources for MIT 6.824 Distributed Systems Class
A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to get it to do what you ask.