Stars
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MindSpore is a new open-source deep learning training/inference framework that can be used for mobile, edge, and cloud scenarios.
Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference
Awesome LLM compression research papers and tools.