feat: Add inference engine - NVIDIA Triton Inference Server and TRT-LLM #821
Labels
- `engineering: Jan Inference Layer` (Jan can serve models locally: with correct data structs, APIs, multi-inference engines, multi-model)
- `P1: important` (Important feature / fix)
- `type: feature request` (A new feature)
Problem
I have an existing NVIDIA Triton Inference Server with TensorRT-LLM as the backend. I want to use that model within Jan.
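
For context, a Triton deployment running the TensorRT-LLM backend is typically queried over HTTP via Triton's generate endpoint. A minimal sketch of such a request follows; the model name `ensemble` and the default port 8000 are assumptions about this particular deployment, not part of the issue:

`POST http://localhost:8000/v2/models/ensemble/generate`

```json
{
  "text_input": "What is machine learning?",
  "max_tokens": 64,
  "bad_words": "",
  "stop_words": ""
}
```

Whatever inference engine Jan adds would need to issue requests of roughly this shape against the user's existing server.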
Success Criteria
- `nvidia-inference-engine-trt-llm/engine.json` (sketched under Additional context below)
- `model.json` for llama2-7b (sketched under Additional context below)

Additional context
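To make the success criteria concrete, here are hypothetical sketches of the two files. The field names are assumptions for illustration only, not Jan's actual schema; the server URL and Triton model name likewise depend on the user's deployment.

A possible `nvidia-inference-engine-trt-llm/engine.json`, pointing Jan at the existing Triton server:

```json
{
  "_comment": "hypothetical schema, for illustration only",
  "engine": "nvidia-triton-trt-llm",
  "base_url": "http://localhost:8000",
  "api": "generate"
}
```

A possible `model.json` for llama2-7b, mapping a Jan model entry onto a Triton model name:

```json
{
  "_comment": "hypothetical schema, for illustration only",
  "id": "llama2-7b",
  "name": "Llama 2 7B (Triton / TensorRT-LLM)",
  "engine": "nvidia-triton-trt-llm",
  "triton_model": "ensemble",
  "parameters": {
    "max_tokens": 512,
    "temperature": 0.7
  }
}
```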