
Add switch to disable RAG on attachments. #3556

Closed · nickovs opened this issue Jun 30, 2024 · 2 comments

nickovs (Contributor) commented Jun 30, 2024

Currently, if you drop even a single document onto Open WebUI and ask a question about it, RAG appears to always be used. It would be helpful to be able to switch this off.

Currently, for each attachment the document text is extracted, the text is chunked, embeddings are computed and indexed, an index lookup is performed, and then only some of the chunks are passed to the LLM. While this is an acceptable approach for retrieval tasks, it fails badly for summarisation tasks.
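(For context, the pipeline described above corresponds roughly to the sketch below. This is not the actual Open WebUI code: the chunk size, the toy hashing "embedding", and all function names are illustrative assumptions standing in for the real embedding model and vector store.)

```python
import hashlib
import math

CHUNK_SIZE = 500  # characters; illustrative only, not Open WebUI's setting

def chunk(text: str) -> list[str]:
    # Split the extracted document text into fixed-size chunks.
    return [text[i:i + CHUNK_SIZE] for i in range(0, len(text), CHUNK_SIZE)]

def embed(chunk_text: str, dims: int = 64) -> list[float]:
    # Toy stand-in for a real embedding model: hash words into a unit vector.
    vec = [0.0] * dims
    for word in chunk_text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def top_k(query: str, index: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    # Retrieve the k chunks most similar to the query (cosine similarity).
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, v)), c) for c, v in index]
    scored.sort(reverse=True)  # ranked by score, NOT by document order
    return [c for _, c in scored[:k]]

# Index each attachment, then pass only the top-k chunks to the LLM.
document_text = "..."  # output of the text-extraction step
index = [(c, embed(c)) for c in chunk(document_text)]
context = "\n\n".join(top_k("summarise this document", index))
```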

For LLMs with small context windows summarisation is hard, but many LLMs now support very large context windows: GPT-4o has a 128K-token context window, and Anthropic Claude 3 supports 200K tokens, or about 400 pages of text. To make the best use of these longer-context models it would be helpful to be able to pass the entire extracted text of all attachments to the LLM. A switch to bypass the RAG steps would achieve this.
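(In principle such a switch could be as small as the sketch below, reusing the toy helpers from the previous sketch. `bypass_rag` is a hypothetical flag for illustration, not an existing Open WebUI setting.)

```python
def build_context(document_text: str, query: str, bypass_rag: bool) -> str:
    # Hypothetical flag: when set, skip chunking/indexing/retrieval entirely
    # and hand the full extracted text, in its original order, to the LLM.
    if bypass_rag:
        return document_text
    index = [(c, embed(c)) for c in chunk(document_text)]
    return "\n\n".join(top_k(query, index))
```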

tjbck (Contributor) commented Jul 1, 2024

This behaviour has been changed as of 0.3.6.

tjbck closed this as completed Jul 1, 2024
nickovs (Contributor, Author) commented Jul 1, 2024

Can you describe the new behaviour? I am using 0.3.7 (commit 7bc88eb), and when I attach a multi-page document only a subset of 5 pages is passed as context, not in the order of the original document. If I change the "Top K" parameter to 10, then 10 pages are passed in, again not in the original document order. Clearly something is still preventing more than the top-k hits from being used, and the re-ordering suggests that they come from some sort of lookup result.
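(The behaviour described here matches what any top-k retrieval step produces: raising k returns more chunks, but they arrive ranked by similarity score rather than by page order. Reusing the toy helpers from the first sketch, with made-up page contents:)

```python
# 20 extracted "pages"; the contents are made up purely for illustration.
pages = [f"Page {n}: notes about topic {n}" for n in range(1, 21)]
index = [(p, embed(p)) for p in pages]

print(top_k("what does page 7 say?", index, k=5))   # 5 chunks, in score order
print(top_k("what does page 7 say?", index, k=10))  # 10 chunks, still score order
```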
