Add support to use custom chat template for usage with koboldcpp #3183
FYI, the adapter can be sent over the API, but KoboldCpp also accepts plain JSON adapter files that can be loaded independently of any frontend such as open-webui. Here's a sample file: https://github.com/LostRuins/koboldcpp/wiki#what-is---chatcompletionsadapter When launching KoboldCpp, you can apply an adapter manually using this technique: it can be selected in the GUI loader or, if using the CLI, with the corresponding launch flag. Of course, this does not preclude sending an adapter over the API; if an adapter is sent over the API, it takes precedence over the one selected at load time.
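For illustration, the adapter is a flat JSON object of per-role start/end marker strings (the key names below are the ones quoted later in this thread; see the wiki link above for the authoritative schema). A minimal sketch of how a client might apply such an adapter to a list of chat messages; `render_prompt` is a hypothetical helper, not a KoboldCpp or open-webui function:

```python
import json

# Sample adapter file contents (Llama-3-style markers, as used later in
# this thread). The exact schema is documented in the KoboldCpp wiki.
ADAPTER_JSON = """
{
  "system_start": "<|start_header_id|>system<|end_header_id|>\\n\\n",
  "system_end": "<|eot_id|>",
  "user_start": "<|start_header_id|>user<|end_header_id|>\\n\\n",
  "user_end": "<|eot_id|>",
  "assistant_start": "<|start_header_id|>assistant<|end_header_id|>\\n\\n",
  "assistant_end": "<|eot_id|>"
}
"""

def render_prompt(messages, adapter):
    """Illustrative only: wrap each message in its role's start/end
    markers, then open the assistant turn for the model to continue."""
    parts = []
    for msg in messages:
        role = msg["role"]  # "system", "user", or "assistant"
        parts.append(adapter[f"{role}_start"] + msg["content"] + adapter[f"{role}_end"])
    parts.append(adapter["assistant_start"])  # model completes from here
    return "".join(parts)

adapter = json.loads(ADAPTER_JSON)
print(render_prompt([{"role": "user", "content": "Hi"}], adapter))
```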
Sorry if this is off-topic for the issue, but does open-webui support the KoboldCpp and Horde APIs?
KoboldCpp provides an OpenAI-compatible endpoint for completions and chat completions, so it should.
On a side note, I've noticed many GGUFs have a Jinja2-based template value embedded in them, which at least llama-cpp-python uses. Does KoboldCpp support, or plan to support, that? If so, it could be one workaround for this problem.
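For context, these embedded templates are Jinja2 strings (in GGUF metadata they are typically stored under the `tokenizer.chat_template` key). A minimal sketch of rendering one with the `jinja2` package, using a simplified template string for illustration rather than one extracted from a real GGUF:

```python
from jinja2 import Template  # pip install jinja2

# A simplified stand-in for an embedded chat template; real templates
# from GGUF metadata are longer but follow the same Jinja2 syntax.
CHAT_TEMPLATE = (
    "{% for m in messages %}"
    "<|{{ m.role }}|>{{ m.content }}<|end|>"
    "{% endfor %}"
)

def apply_chat_template(template_str, messages):
    """Render a Jinja2 chat template against a list of message dicts."""
    return Template(template_str).render(messages=messages)

print(apply_chat_template(CHAT_TEMPLATE, [{"role": "user", "content": "Hi"}]))
# <|user|>Hi<|end|>
```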
Is your feature request related to a problem? Please describe.
KoboldCpp uses Alpaca as the default chat template for all v1 API completion requests, so models that expect other chat templates do not work correctly.
Describe the solution you'd like
KoboldCpp's workaround is described here: LostRuins/koboldcpp#466 and https://github.com/LostRuins/koboldcpp/pull/466/files/fb6b9a8c41e8e8000fb18e119be3704c646ba55e.
It would be great if we could specify a custom chat template for a model to use with koboldcpp.
Describe alternatives you've considered
I currently hardcoded the Llama-3 chat template into backend/apps/openai/main.py, inside async def generate_chat_completion(), by adding this code:
payload["adapter"] = {
    "system_start": "<|start_header_id|>system<|end_header_id|>\n\n",
    "system_end": "<|eot_id|>",
    "user_start": "<|start_header_id|>user<|end_header_id|>\n\n",
    "user_end": "<|eot_id|>",
    "assistant_start": "<|start_header_id|>assistant<|end_header_id|>\n\n",
    "assistant_end": "<|eot_id|>",
}
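A less invasive variant of this workaround (hypothetical, not part of open-webui) is to load the adapter from a JSON file whose path is given by an environment variable, so the template can be swapped without editing code. `CHAT_ADAPTER_FILE` is an assumed variable name for this sketch:

```python
import json
import os

def load_chat_adapter():
    """Return the adapter dict to attach to the payload, or None if not
    configured. CHAT_ADAPTER_FILE is a hypothetical env var name chosen
    for this sketch, not an existing open-webui setting."""
    path = os.environ.get("CHAT_ADAPTER_FILE")
    if not path or not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Inside generate_chat_completion(), instead of hardcoding the dict:
payload = {}
adapter = load_chat_adapter()
if adapter is not None:
    payload["adapter"] = adapter
```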