
RAG is only used on the first chat message #3674

Closed
3 of 4 tasks
aleixdorca opened this issue Jul 6, 2024 · 8 comments
Comments

@aleixdorca
Contributor

Bug Report

Description

Bug Summary:
Open WebUI only applies RAG (Retrieval-Augmented Generation) to the first message of the conversation. From the second message onwards, the retrieved document context no longer seems to be injected into the prompt.

Steps to Reproduce:
Start a new chat, upload a file, and ask a question. Docker and Ollama log the normal RAG behaviour (with the proper RAG prompt). From the second message onwards, RAG is not used.

Expected Behavior:
RAG should be used for all questions in the conversation, shouldn't it?

Environment

  • Open WebUI Version: 0.3.7

  • Ollama (if applicable): 0.1.48

  • Operating System: debian docker

  • Browser (if applicable): Chrome 126.0.6478.127

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.

Logs and Screenshots

Docker Container Logs:

The Docker logs show when the document is uploaded and embedded. For the first question, the RAG prompt appears in the Docker logs as:

Use the following context as your learned knowledge, inside <context></context> XML tags.
<context>
REDACTED (but it is ok)
</context>

When answer to user:
- If you don't know, just say that you don't know.
- If you don't know when you are not sure, ask for clarification.
Avoid mentioning that you obtained the information from the context.
And answer according to the language of the user's question.

Given the context information, answer the query.
Query: My question

Ollama logs the query as well. On the second message, however, Ollama only logs a plain message; no RAG is used at all.
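The behaviour in the logs can be sketched as follows. This is a hypothetical simplification for illustration, not Open WebUI's actual code: retrieved context is only injected when the *current* message carries a file attachment, so follow-up turns reach the model without any `<context>` block.

```python
# Hypothetical sketch of the observed behaviour (illustrative names, not
# Open WebUI's real API): context is injected only for the turn that
# uploads the file.

def build_prompt(message: str, attachments: list[str], retrieve) -> str:
    """Assemble the prompt sent to the model for one chat turn."""
    if attachments:  # only true for the turn that uploads the file
        context = "\n".join(retrieve(message, attachments))
        return (
            "Use the following context as your learned knowledge, "
            "inside <context></context> XML tags.\n"
            f"<context>\n{context}\n</context>\n\n"
            f"Given the context information, answer the query.\nQuery: {message}"
        )
    return message  # follow-up turns: plain message, no RAG

# First turn (file attached) gets the RAG template...
first = build_prompt("Summarise the report", ["report.pdf"],
                     lambda q, docs: ["chunk about the report"])
# ...but the second turn goes out as a bare message.
second = build_prompt("And the key figures?", [], None)
```

This matches the logs above: the first query is wrapped in the `<context>` template, while subsequent queries are sent verbatim.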

Installation Method

The project was installed using Docker

@aleixdorca
Contributor Author

I have also tried with different models (mistral, llama3, gemma2), just in case. Same behaviour.

@silentoplayz
Collaborator

silentoplayz commented Jul 6, 2024

This is not a bug, but rather a deliberate change in how RAG handles uploaded documents. Since the introduction of the Knowledge feature, the default behavior has been updated. Now, uploaded documents are only considered within the context of a single message. To restore the previous functionality, you can enable uploaded documents or collections of documents as Knowledge for a model file in the Models section of the Workspace. This allows the model file to retain knowledge of the documents from the initial message onwards, eliminating the need to manually add documents to each query during a chat session with the model.

@aleixdorca
Contributor Author

Thanks for answering and closing the bug report.

I don't get it, though. The way you put it implies (please correct me if I'm wrong):

  • Only administrators can upload documents to the Knowledge model section.
  • Regular users have no access to the Workspace, so this is effectively limited to a few specific users.
  • Whenever a user wants to chat with a document or webpage (the same happens with # elements), they have to keep re-uploading it (even though it is not re-embedded, which I understand). This is unbearably tedious.
  • If, for whatever reason, they forget to re-upload the document or webpage, the answer is completely hallucinated.

This breaks a major feature of Open WebUI, in my opinion.

To add to this, the setup we are testing at our university gives access to 50 users, none with admin rights. Admins should add the company's information, this I get, but for casual documents, users should have more control and access to the RAG feature.

@silentoplayz
Collaborator

silentoplayz commented Jul 6, 2024

I understand your concerns and appreciate you breaking down the limitations of the current implementation of RAG within Open WebUI.

You are correct that:

  • Only administrators can upload documents to the Knowledge model section. This isn't inherently a new issue, as both the Documents and Models sections have always been limited to administrative configuration.
  • Regular users don't have access to the Workspace, which means they can't manage documents or the recent addition of model file knowledge, which may seem even more restrictive.

In addition to these existing limitations, the recent change to RAG's handling of uploaded documents has introduced new challenges. You're right that:

  • Users need to reupload documents or link a URL for each chat session, which can be inconvenient.
  • Forgetting to reupload the document or link a URL can lead to hallucinated answers with a weaker model, which can be frustrating for users and undermine the trust they have in the RAG system.

Speaking for many, I acknowledge that this change may have weakened a major feature of Open WebUI in the eyes of some users, and we should revisit the design to make it more user-friendly and accessible.

With this all having been said, I will mention that the Open WebUI team is aware of the need for a more flexible solution that allows users to manage their own documents without relying on administrators. This is an area that is actively being worked on to be improved in the future, and we're excited to introduce "teams" in an upcoming feature. Related - #2924

@aleixdorca
Contributor Author

It's great to hear that you understand the concerns regarding the recent changes to RAG in Open WebUI.

You've accurately outlined the issues, including the limitations for regular users, the inconvenience of reuploading documents per session, and the potential for inaccurate responses due to missing document links.

It's reassuring to know that the Open WebUI team is aware of these challenges and is actively working on a solution. The introduction of "teams" in an upcoming feature seems promising and could address many of the current limitations.

I appreciate your constructive feedback and your willingness to engage in this discussion.

I will keep an eye on future updates.

@Qualzz

Qualzz commented Jul 11, 2024

It's very difficult to have a chat over a document, as the LLM doesn't create its own retrieval query.
You therefore need to repeat every keyword in every message.

Example:
User: Can you retrieve the frame data for whatever here:
AI: Here is the data.
User: Can you also display the images as markdown?
AI: No idea what you're talking about → because the retrieval query will just be "Can you also display the images as markdown?"
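The failure mode above happens because only the latest user message is used as the retrieval query. A common mitigation, sketched below, is to condense prior chat turns into a standalone query before searching. This is an illustrative workaround, not something Open WebUI currently does for per-message uploads.

```python
# Sketch of query condensing: fold earlier user turns into the retrieval
# query so a follow-up like "display the images as markdown" still carries
# the keywords ("frame data") that the vector search needs.
# (Illustrative only; not Open WebUI's implementation.)

def condense_query(history: list[tuple[str, str]], latest: str) -> str:
    """Naively concatenate prior user turns with the latest message to
    form a keyword-rich standalone retrieval query."""
    prior_user_turns = [msg for role, msg in history if role == "user"]
    return " ".join(prior_user_turns + [latest])

history = [("user", "Can you retrieve the frame data for the character?"),
           ("assistant", "Here is the data ...")]
query = condense_query(history, "Can you also display the images as markdown?")
```

A production system would typically ask the LLM itself to rewrite the follow-up into a self-contained question rather than concatenating naively, but even this crude version keeps "frame data" in the retrieval query.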

@silentoplayz
Collaborator

It's very difficult to have a chat over a document, as the LLM doesn't create its own retrieval query. You therefore need to repeat every keyword in every message.

Related: #3516 (reply in thread)

@flyfox666

Finally found the issue that resolved my doubts, haha. I'm waiting for the next version to be released. In fact, I still hope that ordinary users can have their own workspace while the administrator retains oversight, so that a company's internal BU departments can deploy more quickly and easily!

Really appreciated.
