Add support to use custom chat template for usage with koboldcpp #3183

Closed
Tureti opened this issue Jun 15, 2024 · 4 comments

Tureti commented Jun 15, 2024

Is your feature request related to a problem? Please describe.
KoboldCpp uses Alpaca as the default chat template for all v1 API completion requests, so models that expect a different chat template do not work correctly.

Describe the solution you'd like
KoboldCpp's workaround is described in LostRuins/koboldcpp#466 and https://github.com/LostRuins/koboldcpp/pull/466/files/fb6b9a8c41e8e8000fb18e119be3704c646ba55e.
It would be great if we could specify a custom chat template for a model to use with koboldcpp.

Describe alternatives you've considered
As a workaround, I have hardcoded the Llama-3 chat template into backend/apps/openai/main.py, inside async def generate_chat_completion(), by adding this code:

```python
payload["adapter"] = {
    "system_start": "<|start_header_id|>system<|end_header_id|>\n\n",
    "system_end": "<|eot_id|>",
    "user_start": "<|start_header_id|>user<|end_header_id|>\n\n",
    "user_end": "<|eot_id|>",
    "assistant_start": "<|start_header_id|>assistant<|end_header_id|>\n\n",
    "assistant_end": "<|eot_id|>",
}
```

@LostRuins

FYI, the adapter can be sent over the API, but KoboldCpp also accepts plain JSON adapter files that can be loaded independently of any frontend such as open-webui.

Here's a sample file: https://github.com/LostRuins/koboldcpp/wiki#what-is---chatcompletionsadapter

When launching KoboldCpp, you can apply such an adapter file manually. It can be selected in the GUI loader:

[screenshot: selecting a ChatCompletions Adapter file in the KoboldCpp GUI loader]

Or, if using the CLI, with the --chatcompletionsadapter launch flag.
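For illustration, an adapter file with the same keys as the payload in the first comment could look like this (the file name llama3-adapter.json is arbitrary; see the linked wiki page for the canonical sample):

```json
{
  "system_start": "<|start_header_id|>system<|end_header_id|>\n\n",
  "system_end": "<|eot_id|>",
  "user_start": "<|start_header_id|>user<|end_header_id|>\n\n",
  "user_end": "<|eot_id|>",
  "assistant_start": "<|start_header_id|>assistant<|end_header_id|>\n\n",
  "assistant_end": "<|eot_id|>"
}
```

and a launch could then look something like this (the exact invocation depends on how you run KoboldCpp):

```
python koboldcpp.py --model mymodel.gguf --chatcompletionsadapter llama3-adapter.json
```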

Of course, this does not preclude sending an adapter over the API, and if an adapter is sent over the API it will take precedence over the one selected on load.
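A sketch of that API route, assuming KoboldCpp is listening on its default port 5001 and reusing the adapter keys from the first comment:

```python
import requests

# Hypothetical request against KoboldCpp's OpenAI-compatible chat endpoint;
# the per-request "adapter" field overrides any adapter selected at launch.
resp = requests.post(
    "http://localhost:5001/v1/chat/completions",
    json={
        "model": "llama-3",
        "messages": [{"role": "user", "content": "Hello!"}],
        "adapter": {
            "user_start": "<|start_header_id|>user<|end_header_id|>\n\n",
            "user_end": "<|eot_id|>",
            "assistant_start": "<|start_header_id|>assistant<|end_header_id|>\n\n",
            "assistant_end": "<|eot_id|>",
        },
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```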

@tukangcode

Sorry if this is off-topic for this issue, but does open-webui support KoboldCpp and the Horde API?

@LostRuins

KoboldCpp provides an OpenAI-compatible endpoint for completions and chat completions, so it should.
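For example, any OpenAI-style client should be able to talk to it by overriding the base URL. A sketch; the default port 5001 and the placeholder API key are assumptions:

```python
from openai import OpenAI

# Point a standard OpenAI client at KoboldCpp's OpenAI-compatible server
# (port 5001 assumed; KoboldCpp typically does not check the API key).
client = OpenAI(base_url="http://localhost:5001/v1", api_key="placeholder")

resp = client.chat.completions.create(
    model="koboldcpp",  # KoboldCpp serves whatever model it loaded at launch
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```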

@TheTerrasque

On a side note, I've noticed that many GGUFs have a Jinja2-based chat template value embedded in them, which at least llama-cpp-python uses. Does KoboldCpp support, or plan to support, that? If so, it could be one workaround for this problem.
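For reference, a minimal sketch of what applying such an embedded template looks like. The template string below is a simplified Llama-3-style example written for illustration only; real GGUF files store the full template under the tokenizer.chat_template metadata key:

```python
from jinja2 import Template

# Simplified Llama-3-style chat template, for illustration only.
CHAT_TEMPLATE = (
    "{% for m in messages %}"
    "<|start_header_id|>{{ m['role'] }}<|end_header_id|>\n\n{{ m['content'] }}<|eot_id|>"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|start_header_id|>assistant<|end_header_id|>\n\n{% endif %}"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
]

# Render the conversation into the model's expected prompt format.
prompt = Template(CHAT_TEMPLATE).render(messages=messages, add_generation_prompt=True)
print(prompt)
```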

Tureti closed this as not planned on Jun 15, 2024.