Open deep learning compiler stack for CPU, GPU, and specialized accelerators
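A minimal sketch of what compiling with this stack (Apache TVM) looks like, assuming an older TVM release that still ships the classic tensor-expression schedule API (`te.create_schedule`); the `llvm` target compiles for the local CPU, while a `rocm` target would additionally need a GPU schedule with thread bindings:

```python
import numpy as np
import tvm
from tvm import te

# Define a simple elementwise computation: B[i] = A[i] + 1.0
n = te.var("n")
A = te.placeholder((n,), name="A", dtype="float32")
B = te.compute(A.shape, lambda i: A[i] + 1.0, name="B")

# Lower and compile for the local CPU.
s = te.create_schedule(B.op)
f = tvm.build(s, [A, B], target="llvm")

dev = tvm.cpu()
a = tvm.nd.array(np.arange(8, dtype="float32"), dev)
b = tvm.nd.empty((8,), "float32", dev)
f(a, b)
print(b.numpy())  # [1. 2. ... 8.]
```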
Stable Diffusion web UI
PygmalionAI's large-scale inference engine
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
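A minimal offline-inference sketch using vLLM's Python API; the model name `facebook/opt-125m` is illustrative and is fetched from the Hugging Face Hub:

```python
from vllm import LLM, SamplingParams

prompts = ["The ROCm software stack is"]
params = SamplingParams(temperature=0.8, max_tokens=32)

# vLLM batches and schedules requests internally (PagedAttention),
# so a list of prompts is served in one call.
llm = LLM(model="facebook/opt-125m")
for out in llm.generate(prompts, params):
    print(out.outputs[0].text)
```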
ROCm Install Utilities: rocminstall.py script to install a specific ROCm release version/revision.
TOML-annotated C header file format for packaging binary files, from Microsoft Research
Voice-to-voice personal assistant; fully local and GPU-vendor agnostic.
Scripts to set up, build, and test the installation of AMD ROCm MIVisionX
Instructions on using PyTorch with AMD GPUs on Linux
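A quick sanity check, assuming a ROCm build of PyTorch is installed (e.g. the ROCm wheels from pytorch.org). ROCm builds of PyTorch reuse the CUDA device API, so existing `torch.cuda.*` code targets the AMD GPU unchanged:

```python
import torch

print(torch.__version__)          # ROCm wheels report e.g. "2.x.x+rocm6.x"
print(torch.cuda.is_available())  # True once the AMD GPU is visible

if torch.cuda.is_available():
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x                     # executes on the AMD GPU via HIP
    print(y.device)
```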
MIVisionX Python Inference Analyzer uses pre-trained ONNX/NNEF/Caffe models to analyze inference results and summarize per-image results
Syncs Docker images commonly used in AI development (e.g. pytorch) to an Alibaba Cloud image registry, so they can be pulled quickly from within China.
MIVisionX Infrastructure for Neural Net Training and Inference with Optimized Data Augmentation through RALI