feat: ollama server naming support #1785

Open

francip opened this issue Apr 26, 2024 · 2 comments

francip commented Apr 26, 2024

Is your feature request related to a problem? Please describe.
When Open WebUI is configured with several connections to different Ollama servers running the same model (e.g. llama3:latest), it is impossible to tell from the model selection dropdown in the chat which connection a model is running on. Similarly, it is impossible to know which connection a particular response came from.

Describe the solution you'd like
Allow nicknaming the connections (with a pre-generated nickname for "local" connections, i.e. those running in the same Docker container or on the same machine), and then show the nickname wherever the model name is shown.
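
As a purely illustrative sketch (not an existing Open WebUI implementation; the connection list, nicknames, and helper name below are made up), the backend could tag each model returned by an Ollama connection's /api/tags endpoint with that connection's nickname, so the frontend can display entries like "llama3:latest (desktop-4090)":

```python
# Hypothetical sketch only: attach a user-defined nickname to every model
# entry so duplicate model names from different Ollama servers stay
# distinguishable in the model selection dropdown.
import requests

# Assumed config shape: Ollama base URL -> nickname (the nickname is the
# proposed feature; these values are examples).
OLLAMA_CONNECTIONS = {
    "http://localhost:11434": "laptop-3070",
    "http://192.168.1.20:11434": "desktop-4090",
}

def list_models_with_nicknames():
    models = []
    for base_url, nickname in OLLAMA_CONNECTIONS.items():
        try:
            # Ollama's /api/tags returns {"models": [{"name": ...}, ...]}
            resp = requests.get(f"{base_url}/api/tags", timeout=5)
            resp.raise_for_status()
        except requests.RequestException:
            continue  # skip connections that are currently unreachable
        for model in resp.json().get("models", []):
            models.append({
                "name": model["name"],
                "connection": base_url,
                "nickname": nickname,
                # What the dropdown could show: "llama3:latest (desktop-4090)"
                "display_name": f'{model["name"]} ({nickname})',
            })
    return models
```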

Describe alternatives you've considered
Alternatively, show the full connection string in the (i) tooltip.

Additional context
Here's an example of the list of models from two different Ollama servers. There should be two entries for the llama3:latest model, one from each connection. It is important to support this because the two machines have wildly different capabilities (a laptop with an Nvidia 3070 w/ 8 GB vs. a desktop with an Nvidia 4090 w/ 24 GB).

[Screenshot 2024-04-26 160632: model list showing entries from both connections]

@tjbck tjbck changed the title Support nicknaming connections to Ollama servers, and showing the nickname in the model selection dropdown feat: ollama server naming support Apr 27, 2024
@mjtechguy

This please!

@befocken

Is there already a plan for implementing this? I can envision two broad use cases: first, transparent load balancing, as it works today (with the possible addition of showing which connection a response actually came from); second, combining systems with very different capabilities, for example a local set of models that is always available and a remote set that is only sometimes available.

I believe both use cases are realistic, although the latter might occur more often in hobbyist settings. Which one is targeted would, of course, influence how this is implemented.
