This issue was moved to a discussion.

bug: YouTube RAG fails when subtitles are not available #3435

Closed
4 tasks
bannert1337 opened this issue Jun 25, 2024 · 1 comment

Comments

@bannert1337
Contributor

bannert1337 commented Jun 25, 2024

Bug Report

Description

Bug Summary:
When referencing a YouTube video for which no subtitles are available in the requested language, an error occurs and the video cannot be added as a reference.

Steps to Reproduce:

  1. Attempt to reference a YouTube video without available subtitles using the YouTube RAG.
  2. Observe the error message: "Something went wrong :/ Could not retrieve a transcript for the video."

Expected Behavior:
The system should gracefully handle videos without available subtitles and provide a more informative error message or alternative behavior.
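
As an illustration of what a "more informative error message" could look like, here is a minimal sketch. The function name `describe_missing_transcript` and its parameters are hypothetical and are not part of Open WebUI or youtube_transcript_api; the sketch assumes the caller already knows which languages were requested and which are available.

```python
from typing import Sequence


def describe_missing_transcript(
    video_id: str,
    requested: Sequence[str],
    available: Sequence[str],
) -> str:
    """Build a short, user-facing message (hypothetical helper).

    `requested` and `available` are plain language codes such as "en" or "de".
    """
    if not available:
        # Nothing to fall back to: the video has no transcripts at all.
        return (
            f"Video {video_id} has no transcripts, "
            "so it cannot be used as a reference."
        )
    wanted = ", ".join(requested) or "any language"
    found = ", ".join(available)
    return (
        f"Video {video_id} has no transcript in: {wanted}. "
        f"Available transcript languages: {found}."
    )


# Values taken from the error message reported in this issue.
print(describe_missing_transcript("6nuPVOKfq18", ["en"], ["de"]))
```

A message of this size tells the user what was requested and what exists, without the full list of translation languages.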

Actual Behavior:
The system fails and outputs an error message, preventing the YouTube video from being referenced.

Environment

  • Open WebUI Version: [Please specify]

  • Ollama (if applicable): [Please specify]

  • Operating System: [Please specify]

  • Browser (if applicable): [Please specify]

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.

Logs and Screenshots

Browser Console Logs:
[Include relevant browser console logs, if applicable]

Docker Container Logs:
[Include relevant Docker container logs, if applicable]

Screenshots (if applicable):
[Attach any relevant screenshots to help illustrate the issue]

Installation Method

[Describe the method you used to install the project, e.g., manual installation, Docker, package manager, etc.]

Additional Information

The error message received is as follows:

Something went wrong :/
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=6nuPVOKfq18!
This is most likely caused by: No transcripts were found for any of the requested language codes: ['en']
For this video (6nuPVOKfq18) transcripts are available in the following languages:
(MANUALLY CREATED) None 
(GENERATED) - de ("German (auto-generated)")
[TRANSLATABLE] (TRANSLATION LANGUAGES) - af ("Afrikaans") - ak ("Akan") - sq ("Albanian") - am ("Amharic") - ar ("Arabic") - hy ("Armenian") - as ("Assamese") - ay ("Aymara") - az ("Azerbaijani") - bn ("Bangla") - eu ("Basque") - be ("Belarusian") - bho ("Bhojpuri") - bs ("Bosnian") - bg ("Bulgarian") - my ("Burmese") - ca ("Catalan") - ceb ("Cebuano") - zh-Hans ("Chinese (Simplified)") - zh-Hant ("Chinese (Traditional)") - co ("Corsican") - hr ("Croatian") - cs ("Czech") - da ("Danish") - dv ("Divehi") - nl ("Dutch") - en ("English") - eo ("Esperanto") - et ("Estonian") - ee ("Ewe") - fil ("Filipino") - fi ("Finnish") - fr ("French") - gl ("Galician") - lg ("Ganda") - ka ("Georgian") - de ("German") - el ("Greek") - gn ("Guarani") - gu ("Gujarati") - ht ("Haitian Creole") - ha ("Hausa") - haw ("Hawaiian") - iw ("Hebrew") - hi ("Hindi") - hmn ("Hmong") - hu ("Hungarian") - is ("Icelandic") - ig ("Igbo") - id ("Indonesian") - ga ("Irish") - it ("Italian") - ja ("Japanese") - jv ("Javanese") - kn ("Kannada") - kk ("Kazakh") - km ("Khmer") - rw ("Kinyarwanda") - ko ("Korean") - kri ("Krio") - ku ("Kurdish") - ky ("Kyrgyz") - lo ("Lao") - la ("Latin") - lv ("Latvian") - ln ("Lingala") - lt ("Lithuanian") - lb ("Luxembourgish") - mk ("Macedonian") - mg ("Malagasy") - ms ("Malay") - ml ("Malayalam") - mt ("Maltese") - mi ("Māori") - mr ("Marathi") - mn ("Mongolian") - ne ("Nepali") - nso ("Northern Sotho") - no ("Norwegian") - ny ("Nyanja") - or ("Odia") - om ("Oromo") - ps ("Pashto") - fa ("Persian") - pl ("Polish") - pt ("Portuguese") - pa ("Punjabi") - qu ("Quechua") - ro ("Romanian") - ru ("Russian") - sm ("Samoan") - sa ("Sanskrit") - gd ("Scottish Gaelic") - sr ("Serbian") - sn ("Shona") - sd ("Sindhi") - si ("Sinhala") - sk ("Slovak") - sl ("Slovenian") - so ("Somali") - st ("Southern Sotho") - es ("Spanish") - su ("Sundanese") - sw ("Swahili") - sv ("Swedish") - tg ("Tajik") - ta ("Tamil") - tt ("Tatar") - te ("Telugu") - th ("Thai") - ti ("Tigrinya") - ts ("Tsonga") - tr ("Turkish") - tk ("Turkmen") - uk ("Ukrainian") - ur ("Urdu") - ug ("Uyghur") - uz ("Uzbek") - vi ("Vietnamese") - cy ("Welsh") - fy ("Western Frisian") - xh ("Xhosa") - yi ("Yiddish") - yo ("Yoruba") - zu ("Zulu")
If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues.
Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error.
Also make sure that there are no open issues which already describe your problem!

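The log shows that only English ('en') was requested, while this video offers just an auto-generated German transcript, which is listed as translatable to English. This suggests the need to handle cases where no transcript exists in the requested language, to ensure a better user experience.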
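
One possible "alternative behavior" is to fall back to whatever transcript exists and, where the listing says it is translatable, ask for a translation into the requested language. The sketch below only covers the selection logic on plain language codes. The names `TranscriptChoice` and `choose_transcript` are hypothetical, and the language lists are assumed to come from whatever transcript listing the loader already receives; this is not how Open WebUI currently behaves.

```python
from typing import NamedTuple, Optional, Sequence


class TranscriptChoice(NamedTuple):
    source_language: str         # language of the transcript to fetch
    translate_to: Optional[str]  # target language, or None to use it as-is


def choose_transcript(
    requested: Sequence[str],
    manual: Sequence[str],
    generated: Sequence[str],
    translation_targets: Sequence[str],
) -> Optional[TranscriptChoice]:
    """Pick a transcript to use, or None if the video has none (hypothetical)."""
    # 1. A transcript in a requested language exists: use it directly.
    for lang in requested:
        if lang in manual or lang in generated:
            return TranscriptChoice(lang, None)

    # Manually created transcripts are listed first, so they are preferred
    # over auto-generated ones as the fallback source.
    available = list(manual) + list(generated)
    if not available:
        return None

    # 2. Translate the fallback transcript into a requested language.
    for lang in requested:
        if lang in translation_targets:
            return TranscriptChoice(available[0], lang)

    # 3. Last resort: use the fallback transcript untranslated.
    return TranscriptChoice(available[0], None)


# Situation from the log: 'en' requested, only generated 'de' available,
# and 'en' appears among the translation languages.
print(choose_transcript(["en"], [], ["de"], ["af", "de", "en"]))
```

Returning `None` for videos with no transcripts at all keeps that case distinct, so the caller can still show an error instead of silently adding an empty reference.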

Note

If the bug report is incomplete or does not follow the provided instructions, it may not be addressed. Please ensure that you have followed the steps outlined in the README.md and troubleshooting.md documents, and provide all necessary information for us to reproduce and address the issue. Thank you!

@justinh-rahb
Collaborator

Behaving as expected. YouTube RAG needs a transcript to work... Not sure how it's expected to resolve this.

@justinh-rahb justinh-rahb converted this issue into discussion #3446 Jun 25, 2024
