
RAG is only used on the first chat message #3674

Closed
3 of 4 tasks
aleixdorca opened this issue Jul 6, 2024 · 8 comments
Comments

@aleixdorca
Contributor

Bug Report

Description

Bug Summary:
Open WebUI only applies RAG (Retrieval-Augmented Generation) to the first message of the conversation. From the second message onwards, the retrieved document context no longer seems to be injected into the prompt.

Steps to Reproduce:
Start a new chat, upload a file, and ask a question. Docker and Ollama log the normal RAG behaviour (with the proper RAG prompt). From the second message onwards, RAG is not used.

Expected Behavior:
RAG should be used for all questions in the conversation, shouldn't it?

Environment

  • Open WebUI Version: 0.3.7

  • Ollama (if applicable): 0.1.48

  • Operating System: debian docker

  • Browser (if applicable): Chrome 126.0.6478.127

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.

Logs and Screenshots

Docker Container Logs:

The Docker logs show when the document is uploaded and embedded. For the first question, the RAG prompt appears in the Docker logs as:

Use the following context as your learned knowledge, inside <context></context> XML tags.
<context>
REDACTED (but it is ok)
</context>

When answer to user:
- If you don't know, just say that you don't know.
- If you don't know when you are not sure, ask for clarification.
Avoid mentioning that you obtained the information from the context.
And answer according to the language of the user's question.

Given the context information, answer the query.
Query: My question

Ollama logs the query as well. On the second message, however, Ollama only logs a plain message; no RAG is used at all.
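The behaviour in the logs can be sketched as follows. This is a hypothetical simplification for illustration, not Open WebUI's actual code: retrieved context is only injected when the *current* message carries a file attachment, so follow-up turns reach the model without any `<context>` block.

```python
# Hypothetical sketch of the observed behaviour (illustrative names, not
# Open WebUI's real API): context is injected only for the turn that
# uploads the file.

def build_prompt(message: str, attachments: list[str], retrieve) -> str:
    """Assemble the prompt sent to the model for one chat turn."""
    if attachments:  # only true for the turn that uploads the file
        context = "\n".join(retrieve(message, attachments))
        return (
            "Use the following context as your learned knowledge, "
            "inside <context></context> XML tags.\n"
            f"<context>\n{context}\n</context>\n\n"
            f"Given the context information, answer the query.\nQuery: {message}"
        )
    return message  # follow-up turns: plain message, no RAG

# First turn (file attached) gets the RAG template...
first = build_prompt("Summarise the report", ["report.pdf"],
                     lambda q, docs: ["chunk about the report"])
# ...but the second turn goes out as a bare message.
second = build_prompt("And the key figures?", [], None)
```

This matches the logs above: the first query is wrapped in the `<context>` template, while subsequent queries are sent verbatim.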

Installation Method

The project was installed using Docker

@aleixdorca
Contributor Author

I have also tried with different models (mistral, llama3, gemma2), just in case. Same behaviour.

@silentoplayz
Collaborator

silentoplayz commented Jul 6, 2024

This is not a bug, but rather a deliberate change in how RAG handles uploaded documents. Since the introduction of the Knowledge feature, the default behavior has been updated. Now, uploaded documents are only considered within the context of a single message. To restore the previous functionality, you can enable uploaded documents or collections of documents as Knowledge for a model file in the Models section of the Workspace. This allows the model file to retain knowledge of the documents from the initial message onwards, eliminating the need to manually add documents to each query during a chat session with the model.

@aleixdorca
Contributor Author

Thanks for answering and closing the bug report.

I don't get it, though. The way you put it implies (please correct me if I'm wrong):

  • Only administrators can upload documents to the Knowledge model section.
  • Regular users have no access to the Workspace, so this is effectively limited to a few specific users.
  • Whenever a user wants to chat with a document or webpage (the same happens with # elements), they have to keep re-uploading it (even though it is not re-embedded, which I understand). This is unbearably tedious.
  • If, for whatever reason, they forget to re-upload the document or webpage, the answer is completely hallucinated.

This breaks a major feature of Open WebUI, in my opinion.

To add to this, the setup we are testing at our university gives access to 50 users, none with admin rights. Admins should add the company's information, this I get, but for casual documents, users should have more control and access to the RAG feature.

@silentoplayz
Collaborator

silentoplayz commented Jul 6, 2024

I understand your concerns and appreciate you breaking down the limitations of the current implementation of RAG within Open WebUI.

You are correct that:

  • Only administrators can upload documents to the Knowledge model section. This isn't inherently a new issue, as both the Documents and Models sections have always been limited to administrative configuration.
  • Regular users don't have access to the Workspace, which means they can't manage documents or the recent addition of model file knowledge, which may seem even more restrictive.

In addition to these existing limitations, the recent change to RAG's handling of uploaded documents has introduced new challenges. You're right that:

  • Users need to reupload documents or link a URL for each chat session, which can be inconvenient.
  • Forgetting to reupload the document or link a URL can lead to hallucinated answers with a weaker model, which can be frustrating for users and undermine the trust they have in the RAG system.

Speaking for many, I acknowledge that this change may have weakened a major feature of Open WebUI in the eyes of some users, and we should revisit the design to make it more user-friendly and accessible.

With this all having been said, I will mention that the Open WebUI team is aware of the need for a more flexible solution that allows users to manage their own documents without relying on administrators. This is an area that is actively being worked on to be improved in the future, and we're excited to introduce "teams" in an upcoming feature. Related - #2924

@aleixdorca
Contributor Author

It's great to hear that you understand the concerns regarding the recent changes to RAG in Open WebUI.

You've accurately outlined the issues, including the limitations for regular users, the inconvenience of reuploading documents per session, and the potential for inaccurate responses due to missing document links.

It's reassuring to know that the Open WebUI team is aware of these challenges and is actively working on a solution. The introduction of "teams" in an upcoming feature seems promising and could address many of the current limitations.

I appreciate your constructive feedback and your willingness to engage in this discussion.

I will keep an eye on future updates.

@Qualzz

Qualzz commented Jul 11, 2024

It's very difficult to have a chat over a document, as the LLM doesn't create its own retrieval query.
You therefore need to repeat every keyword in every message.

Example:
User: Can you retrieve the frame data for whatever here:
AI: Here is the data.
User: Can you also display the images as markdown?
AI: No idea what you're talking about → because the retrieval query will just be "Can you also display the images as markdown?"
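The failure mode above happens because only the latest user message is used as the retrieval query. A common mitigation, sketched below, is to condense prior chat turns into a standalone query before searching. This is an illustrative workaround, not something Open WebUI currently does for per-message uploads.

```python
# Sketch of query condensing: fold earlier user turns into the retrieval
# query so a follow-up like "display the images as markdown" still carries
# the keywords ("frame data") that the vector search needs.
# (Illustrative only; not Open WebUI's implementation.)

def condense_query(history: list[tuple[str, str]], latest: str) -> str:
    """Naively concatenate prior user turns with the latest message to
    form a keyword-rich standalone retrieval query."""
    prior_user_turns = [msg for role, msg in history if role == "user"]
    return " ".join(prior_user_turns + [latest])

history = [("user", "Can you retrieve the frame data for the character?"),
           ("assistant", "Here is the data ...")]
query = condense_query(history, "Can you also display the images as markdown?")
```

A production system would typically ask the LLM itself to rewrite the follow-up into a self-contained question rather than concatenating naively, but even this crude version keeps "frame data" in the retrieval query.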

@silentoplayz
Collaborator

It's very difficult to have a chat over a document, as the LLM doesn't create its own retrieval query. You therefore need to repeat every keyword in every message.

Related: #3516 (reply in thread)

@flyfox666

Finally found the issue that resolved my doubts, haha. I'm waiting for the next version to be released. In fact, I still hope that ordinary users can have their own workspace while the administrator retains oversight, so that a company's internal BU departments can deploy more quickly and easily!

Really appreciated.
