enh: reset requested voice model after switching TTS Engine #3689

Open
4 tasks done
Ahmedsaed opened this issue Jul 6, 2024 · 7 comments

Comments

@Ahmedsaed

Bug Report

Description

Bug Summary:
Before setting up openedai-speech, I was using the webapi voice Google UK English Female. After I completed the openedai-speech setup, I started getting these errors:

Open WebUI logs

INFO:     192.168.1.2:0 - "POST /audio/api/v1/speech HTTP/1.1" 400 Bad Request
ERROR:apps.audio.main:400 Client Error: Bad Request for url: http://openedai-speech:8000/v1/audio/speech
Traceback (most recent call last):
  File "/app/backend/apps/audio/main.py", line 219, in speech
    r.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://openedai-speech:8000/v1/audio/speech

openedai-speech logs

INFO:     192.168.80.3:60494 - "POST /v1/audio/speech HTTP/1.1" 400 Bad Request
2024-07-06 21:26:38.527 | INFO     | openedai:openai_statuserror_handler:106 - BadRequestError(message="Error loading voice: Google UK English Female, KeyError: 'Google UK English Female'", code=400, param=voice)

The error message shows that the requested voice model was not reset or overridden by the value chosen in the admin settings. I had to update the voice model manually in the per-user settings (the Settings button).

I believe the admin settings and the user settings should stay in sync, at least for the admin user.

Steps to Reproduce:

  1. Configure audio settings using webapi.
  2. Configure audio settings using any OpenAI-compatible API.
  3. Try to play any message; the request fails with the error shown in the logs.

Expected Behavior:
When the engine is changed through the admin settings, users' audio settings should be updated or reset.

Actual Behavior:
The voice model chosen in the user settings was still used even though it was invalid for the new engine.
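The expected behavior could be sketched roughly as below. This is only an illustration of the idea, not Open WebUI's code: `resolve_voice` and its parameters are hypothetical names, and the voice names are the ones mentioned in this issue.

```python
from typing import Collection


def resolve_voice(requested: str, admin_default: str,
                  available: Collection[str]) -> str:
    """Hypothetical helper: fall back to the admin default when the
    voice stored in the user settings is unknown to the current engine."""
    if requested in available:
        return requested
    return admin_default


# A voice left over from the webapi engine is replaced by the admin default.
openai_style_voices = {"alloy", "echo", "fable"}
print(resolve_voice("Google UK English Female", "alloy", openai_style_voices))
```

Whether the fallback should happen at request time (as here) or once, when the admin switches engines, is a design choice for the maintainers.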

Environment

  • Open WebUI Version: v0.3.7

  • Operating System: Ubuntu 20.04

  • Browser (if applicable): Chrome latest

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.

Installation Method

Both Open WebUI and openedai-speech were installed through Docker.

@Ahmedsaed
Author

Ahmedsaed commented Jul 6, 2024

Also, the settings page doesn't display the full list of available voices when a voice is already selected. I have to delete the selected voice for the complete list to show.

It feels more like an autocomplete than a drop-down menu.

@Ahmedsaed
Author

I've noticed a bug related to caching.

To explain it, I need to understand where the voices list in the dropdown menu comes from. Specifically, I'm referring to the list that includes voices like alloy, echo, fable, etc.

I assumed this list was populated with the available voices, but when I added my custom voice to the openedai-speech config, the list did not include the new voice, even after restarting.

Interestingly, I can manually enter the name of the custom voice and it works, except when I try to play a message that has already been played with one of the predefined voices.

To elaborate, I initially used the alloy voice to play a message. After switching to my custom voice, the play button didn't work, despite the logs showing no errors.

This issue only occurs with custom models. If I select another voice from the predefined list, it plays normally.

Playing the custom voice on a new message, however, works without any problems.
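I don't know how Open WebUI actually caches generated audio, so the following is not a diagnosis. It is only a minimal sketch of the general idea that a cached clip should be looked up by the text and the voice together, so that switching voices cannot return audio generated for a different voice. `speech_cache_key` and the `tts-1` model name are assumptions made for the example.

```python
import hashlib
import json


def speech_cache_key(text: str, voice: str, model: str = "tts-1") -> str:
    """Hypothetical cache key: changes whenever text, voice, or model changes."""
    payload = json.dumps(
        {"input": text, "voice": voice, "model": model}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same message, different voices: the keys differ, so the cached "alloy"
# audio would not be reused for the custom voice.
key_alloy = speech_cache_key("Hello there", "alloy")
key_custom = speech_cache_key("Hello there", "my-custom-voice")
print(key_alloy != key_custom)
```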

@tjbck tjbck changed the title Reset requested voice model after switching TTS Engine enh: reset requested voice model after switching TTS Engine Jul 8, 2024
@tjbck
Contributor

tjbck commented Jul 8, 2024

PR welcome!

@jason-e-gross

jason-e-gross commented Jul 11, 2024

I can confirm that this is an issue. I stood up my own openedai-speech container, and I can exec into Open WebUI's container, curl the speech container, and get it to generate audio (so I know the endpoint is reachable from within Open WebUI), but Open WebUI keeps throwing 400 Bad Request errors when I watch its Docker logs.

It also has nothing to do with custom voices: I can't get it to work with the default settings.

Docker logs from Open WebUI
image

Docker logs from OpenedAI-Speech
image

It looks like it's passing in some other setting from elsewhere? "Microsoft Zira"?

Oh, huh - I gotta go into my own personal settings also and change it. Odd.

@jason-e-gross

Follow-up: I tried a custom voice and can't seem to get custom voices to work at all, only the standard ones included in openedai-speech.

@Ahmedsaed
Author

Ahmedsaed commented Jul 11, 2024

> Follow-up, tried a custom - and can't seem to get custom voices to work at all. only the standard ones included in openedai-speech.

Have you noticed the third message I wrote about a potential bug with caching and invalidation?

What exactly is the problem you are facing?

Have you tested the custom voice through a curl command?

If the model works through other methods, then the issue is probably related to caching.

Don't test the custom model on a message that has already been played with one of the standard voices.
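For anyone who prefers a script to curl, a direct test against openedai-speech could look roughly like this. The URL is the one from the logs above; the JSON fields follow the OpenAI speech API shape (`model`, `input`, `voice`), and the `tts-1` model name and the helper names are assumptions for this sketch.

```python
import json
import urllib.request

# URL as it appears in the Open WebUI logs in this issue.
SPEECH_URL = "http://openedai-speech:8000/v1/audio/speech"


def build_speech_request(text: str, voice: str, model: str = "tts-1",
                         url: str = SPEECH_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the speech endpoint."""
    body = json.dumps({"model": model, "input": text, "voice": voice})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_speech(text: str, voice: str) -> bytes:
    """Send the request and return the raw audio bytes.
    Raises urllib.error.HTTPError on a 400 such as the one in the logs."""
    with urllib.request.urlopen(build_speech_request(text, voice), timeout=60) as resp:
        return resp.read()
```

Calling `fetch_speech("test", "<your custom voice>")` from inside the Open WebUI container bypasses the UI entirely, which helps separate a voice problem from a caching problem.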

@jason-e-gross

Finally, for me, it's a custom file that just won't work. But I don't know that it's Open WebUI's issue: when I curl the file, openedai-speech generates the mp3, but it's corrupt, so I'm probably doing something wrong there.
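A quick way to tell whether the returned file is audio at all, or for example a JSON error body saved with an .mp3 extension, is to look at its first bytes. This is a rough heuristic sketch with a hypothetical helper name; a file that passes can still be corrupt further in.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic: MP3 data usually starts with an ID3v2 tag or with an
    MPEG audio frame header whose first 11 bits are all set."""
    if len(data) < 3:
        return False
    if data[:3] == b"ID3":
        return True
    return data[0] == 0xFF and (data[1] & 0xE0) == 0xE0


# An error response saved to disk does not look like MP3 data.
print(looks_like_mp3(b'{"error": "bad voice"}'))
```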
