Add support to use custom chat template for usage with koboldcpp #3183
FYI, the adapter can be sent over the API, but KoboldCpp also accepts plain JSON adapter files that can be loaded independently of any frontend such as open-webui. Here's a sample file: https://github.com/LostRuins/koboldcpp/wiki#what-is---chatcompletionsadapter When launching KoboldCpp, you can apply an adapter manually using this technique: it can be selected in the GUI loader or, if using the CLI, with the corresponding launch flag. Of course, this does not preclude sending an adapter over the API; if an adapter is sent over the API, it takes precedence over the one selected at load time.
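For illustration, the adapter is a flat JSON object of per-role start/end marker strings (the key names below are the ones quoted later in this thread; see the wiki link above for the authoritative schema). A minimal sketch of how a client might apply such an adapter to a list of chat messages; `render_prompt` is a hypothetical helper, not a KoboldCpp or open-webui function:

```python
import json

# Sample adapter file contents (Llama-3-style markers, as used later in
# this thread). The exact schema is documented in the KoboldCpp wiki.
ADAPTER_JSON = """
{
  "system_start": "<|start_header_id|>system<|end_header_id|>\\n\\n",
  "system_end": "<|eot_id|>",
  "user_start": "<|start_header_id|>user<|end_header_id|>\\n\\n",
  "user_end": "<|eot_id|>",
  "assistant_start": "<|start_header_id|>assistant<|end_header_id|>\\n\\n",
  "assistant_end": "<|eot_id|>"
}
"""

def render_prompt(messages, adapter):
    """Illustrative only: wrap each message in its role's start/end
    markers, then open the assistant turn for the model to continue."""
    parts = []
    for msg in messages:
        role = msg["role"]  # "system", "user", or "assistant"
        parts.append(adapter[f"{role}_start"] + msg["content"] + adapter[f"{role}_end"])
    parts.append(adapter["assistant_start"])  # model completes from here
    return "".join(parts)

adapter = json.loads(ADAPTER_JSON)
print(render_prompt([{"role": "user", "content": "Hi"}], adapter))
```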
Sorry if this is off-topic for the issue, but does open-webui support the KoboldCpp and Horde APIs?
KoboldCpp provides an OpenAI-compatible endpoint for completions and chat completions, so it should.
On a side note, I've noticed many GGUFs have a Jinja2-based template value embedded in them, which at least llama-cpp-python uses. Does KoboldCpp support, or plan to support, that? If so, it could be one workaround for this problem.
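For context, these embedded templates are Jinja2 strings (in GGUF metadata they are typically stored under the `tokenizer.chat_template` key). A minimal sketch of rendering one with the `jinja2` package, using a simplified template string for illustration rather than one extracted from a real GGUF:

```python
from jinja2 import Template  # pip install jinja2

# A simplified stand-in for an embedded chat template; real templates
# from GGUF metadata are longer but follow the same Jinja2 syntax.
CHAT_TEMPLATE = (
    "{% for m in messages %}"
    "<|{{ m.role }}|>{{ m.content }}<|end|>"
    "{% endfor %}"
)

def apply_chat_template(template_str, messages):
    """Render a Jinja2 chat template against a list of message dicts."""
    return Template(template_str).render(messages=messages)

print(apply_chat_template(CHAT_TEMPLATE, [{"role": "user", "content": "Hi"}]))
# <|user|>Hi<|end|>
```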
Is your feature request related to a problem? Please describe.
KoboldCpp uses Alpaca as the default chat template for all v1 API completion requests, so models that expect other chat templates do not work correctly.
Describe the solution you'd like
KoboldCpp's workaround is described here: LostRuins/koboldcpp#466 and https://github.com/LostRuins/koboldcpp/pull/466/files/fb6b9a8c41e8e8000fb18e119be3704c646ba55e.
It would be great if we could specify a custom chat template for a model to use with koboldcpp.
Describe alternatives you've considered
I currently hardcoded the Llama-3 chat template into backend/apps/openai/main.py, inside async def generate_chat_completion(), by adding this code:
payload["adapter"] = {
    "system_start": "<|start_header_id|>system<|end_header_id|>\n\n",
    "system_end": "<|eot_id|>",
    "user_start": "<|start_header_id|>user<|end_header_id|>\n\n",
    "user_end": "<|eot_id|>",
    "assistant_start": "<|start_header_id|>assistant<|end_header_id|>\n\n",
    "assistant_end": "<|eot_id|>",
}
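A less invasive variant of this workaround (hypothetical, not part of open-webui) is to load the adapter from a JSON file whose path is given by an environment variable, so the template can be swapped without editing code. `CHAT_ADAPTER_FILE` is an assumed variable name for this sketch:

```python
import json
import os

def load_chat_adapter():
    """Return the adapter dict to attach to the payload, or None if not
    configured. CHAT_ADAPTER_FILE is a hypothetical env var name chosen
    for this sketch, not an existing open-webui setting."""
    path = os.environ.get("CHAT_ADAPTER_FILE")
    if not path or not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Inside generate_chat_completion(), instead of hardcoding the dict:
payload = {}
adapter = load_chat_adapter()
if adapter is not None:
    payload["adapter"] = adapter
```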