EmbeddedLLM: an API server for embedded-device deployment. Currently supports CUDA, OpenVINO, IpexLLM, DirectML, and CPU backends.
Topics: windows, cpu, llama, gemma, mistral, directx-12, openvino, npu, openvino-inference-engine, aipc, directml, llm, model-inference, llm-serving, llm-inference, open-source-llm, phi-3, ipexllm
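Since the project serves models over an API, a client would talk to it over HTTP. As a minimal sketch, assuming an OpenAI-compatible chat-completions route on a locally running server (the URL, route, and model name here are illustrative guesses, not confirmed by the listing):

```python
import json
import urllib.request

# Assumed local endpoint; the exact host, port, and route are hypothetical.
URL = "http://localhost:8000/v1/chat/completions"

# "phi-3" is one of the small models named in the repo topics.
payload = {
    "model": "phi-3",
    "messages": [{"role": "user", "content": "Summarize DirectML in one line."}],
    "stream": False,
}

# Build the POST request; urllib.request.urlopen(req) would send it
# once the server is actually running.
req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
```

The request is constructed but not sent, so the sketch runs without a live server; swap in the server's real address and model name before use.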
Updated Oct 6, 2024 · Python