Feature: Allow setting num_batch and make basic parameters like this available in the chat UI #3148

Closed
sammcj opened this issue Jun 14, 2024 · 4 comments

@sammcj
Contributor

sammcj commented Jun 14, 2024

Is your feature request related to a problem? Please describe.

Increasing num_batch can greatly improve inference performance, at the cost of higher VRAM usage.

Depending on the task, it can be worth sacrificing some context size to allow a larger num_batch, or vice versa.

e.g.:

  • Fast responses: num_batch: 2048, num_ctx: 8192
  • Larger context, but a bit slower: num_batch: 512, num_ctx: 32768
  • Middle ground: num_batch: 1024, num_ctx: 16384
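
To make the trade-off concrete, here is a rough sketch of how those three presets might be sent as per-request options to an Ollama backend. It only builds the JSON body and does not send anything. The preset names, the `build_generate_payload` helper, and the `llama3` model name are assumptions for illustration; `num_ctx` and `num_batch` are passed under the `options` key as in Ollama's `/api/generate` request format.

```python
import json

# Values come from the list above; the preset names are hypothetical labels.
PRESETS = {
    "fast": {"num_batch": 2048, "num_ctx": 8192},
    "large_context": {"num_batch": 512, "num_ctx": 32768},
    "middle_ground": {"num_batch": 1024, "num_ctx": 16384},
}


def build_generate_payload(model: str, prompt: str, preset: str) -> dict:
    """Build a request body for Ollama's /api/generate using one preset.

    Hypothetical helper: it only assembles the dict and performs no I/O.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Copy so callers can tweak the options without mutating PRESETS.
        "options": dict(PRESETS[preset]),
    }


if __name__ == "__main__":
    # "llama3" is a placeholder model name.
    print(json.dumps(build_generate_payload("llama3", "Hello", "fast"), indent=2))
```

Switching presets per request like this avoids keeping a separate model copy for each combination, which is the convenience the request is after in the UI.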

Describe the solution you'd like

It would be great if you could set num_batch in Open WebUI.

It would also be really useful if basic parameters such as num_ctx, num_batch, temp/top, num_keep, etc. were available in the chat interface, without having to go into Settings -> Advanced and tweak them there each time.

Describe alternatives you've considered

Right now I'm having to create several copies of models with the num_ctx/num_batch in their name in order to quickly switch between settings. It works, but it's painful.
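
For anyone wanting to script that workaround in the meantime, a minimal sketch that generates one Modelfile per setting combination might look like the following. It assumes the copies are made with Ollama Modelfiles and that `PARAMETER num_batch` is accepted alongside `PARAMETER num_ctx`; the `modelfile_for` helper, the naming scheme, and the `llama3` base model are hypothetical.

```python
def modelfile_for(base_model: str, num_ctx: int, num_batch: int) -> tuple[str, str]:
    """Return (model_name, modelfile_text) for one num_ctx/num_batch combination.

    Hypothetical helper: the naming scheme is just one way to encode the
    settings in the model name, as described above.
    """
    # Avoid a second ':' in the name if the base model carries a tag.
    safe_base = base_model.replace(":", "-")
    name = f"{safe_base}-ctx{num_ctx}-batch{num_batch}"
    text = (
        f"FROM {base_model}\n"
        f"PARAMETER num_ctx {num_ctx}\n"
        # Assumption: num_batch is accepted as a Modelfile parameter.
        f"PARAMETER num_batch {num_batch}\n"
    )
    return name, text


if __name__ == "__main__":
    for ctx, batch in [(8192, 2048), (32768, 512), (16384, 1024)]:
        name, text = modelfile_for("llama3", ctx, batch)
        print(f"# {name}\n{text}")
```

Each generated file could then be registered with `ollama create <name> -f <path>`.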

@tjbck
Contributor

tjbck commented Jun 14, 2024

Great suggestion, PR welcome!

@sammcj
Contributor Author

sammcj commented Jun 14, 2024

PR incoming :)

@tjbck
Contributor

tjbck commented Jun 14, 2024

@sammcj

oi-chad

@sammcj
Contributor Author

sammcj commented Jun 14, 2024

@tjbck

image

sammcj closed this as completed Jun 14, 2024