Feature Request: Maintain Context Length for Title Generation Using Same Model for Chat #3106
Comments
The current workaround is to leave the context length undefined in Open WebUI's parameters (keeping the default) and instead specify it in the model's blob config file. This approach has a limitation: if the default context length is overridden via Open WebUI parameters, the value in the config file is no longer observed, and title generation falls back to a context length of 2048 regardless of what was set in Open WebUI.
I found a workaround! :) Create your own modelfile under Admin -> Models and then use it for both chat and title generation: Deepseek-v2-4096 FROM deepseek-v2:16b
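For reference, the modelfile described in the comment above would look roughly like this (a sketch based on the name and base model quoted there; the `PARAMETER num_ctx` line is an assumption about what the commenter set, chosen to match the `-4096` suffix, so that chat and title generation both load the model with the same context length):

```
# Hypothetical modelfile for "Deepseek-v2-4096"
FROM deepseek-v2:16b
PARAMETER num_ctx 4096
```

Baking `num_ctx` into the model itself means Ollama sees the same effective context length for every request against that model, so no reload is triggered between title generation and chat.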
Did not work for me. It only works when num_ctx is set in the original model's blob file; for a derived model it still used the 2048 default.
Worked for me. |
Is your feature request related to a problem? Please describe.
The current implementation of the title generation functionality in Open WebUI does not maintain the chat's context length (unless it is kept at the default, or the model definition already contains a num_ctx that overrides the default). As a result, the model is reloaded every time a title is generated at chat initiation: once with the title generation's context length, and then again with the chat's context length to continue the conversation. This causes performance issues and inconvenience for users.
Describe the solution you'd like
I propose enhancing the generate_openai_chat_completion function to maintain the context length during title generation when starting a chat. The function should check whether the current model has been used before and, if the same model is configured for title generation, reuse the previously stored context length to avoid reloading the model in the Ollama backend. This can be done safely: if the default num_ctx has not been changed via the general settings, no override is sent in the request; otherwise, the same context length defined there is reused.
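The proposed check can be sketched as a small helper (hypothetical; the function name, the shape of the stored chat options, and the 2048 default are assumptions for illustration, not Open WebUI's actual API):

```python
def title_generation_options(chat_options: dict, default_num_ctx: int = 2048) -> dict:
    """Build the Ollama request options for title generation so the
    model is not reloaded with a different context length.

    chat_options: the options the ongoing chat was started with
                  (e.g. {"num_ctx": 8192}).
    Returns an options dict to send with the title-generation request.
    """
    num_ctx = chat_options.get("num_ctx")
    if num_ctx is None or num_ctx == default_num_ctx:
        # Default was never overridden: send no num_ctx at all, so the
        # backend keeps whatever it already has loaded.
        return {}
    # Reuse the chat's context length, avoiding a reload in Ollama.
    return {"num_ctx": num_ctx}
```

The key design point is the first branch: sending no override at all when the default is unchanged, rather than sending an explicit 2048, keeps the behavior identical for users who never touch the setting.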
Describe alternatives you've considered
An alternative solution would be to modify the generate_title function to accept a separate context-length parameter. However, this would require changes in other parts of the codebase, and it would also require users to manually keep that context length in sync with the num_ctx the backend has loaded for the model.
Additional context
This feature request is relevant to Open WebUI users who use the same model for chat and title generation. Maintaining the context length across both would remove the redundant model reloads and improve performance and user experience for this common setup.