Is your feature request related to a problem? Please describe.
The current RAG implementation in OpenWebUI struggles with large PDFs (hundreds of pages) and extensive document collections. It's limited by token constraints and doesn't utilize full document content effectively. Attempts to integrate alternatives like nomic-embed have led to errors and system instability.
Moreover, the system isn't user-friendly for non-technical users who need to work with large documents but don't understand concepts like token limits.
Integrating FAISS could solve these issues by providing efficient vector search that works independently of the chosen LLM and across languages. This would allow more accurate, context-aware responses from large document collections, significantly enhancing OpenWebUI's functionality and accessibility.
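To illustrate why vector search is LLM- and language-agnostic, here is a minimal sketch of the core operation: nearest-neighbour lookup over embedding vectors. It uses brute-force numpy search in place of FAISS (this is the same computation FAISS's `IndexFlatL2` performs, just without its optimisations), and toy 4-dimensional vectors standing in for real embeddings:

```python
import numpy as np

# Toy corpus of pre-computed chunk embeddings (4-dim for illustration;
# real embedding models produce hundreds of dimensions).
corpus = np.array([
    [0.9, 0.1, 0.0, 0.0],    # e.g. "quarterly revenue summary"
    [0.0, 0.8, 0.2, 0.0],    # e.g. "employee onboarding checklist"
    [0.85, 0.15, 0.0, 0.1],  # e.g. "annual financial report"
], dtype="float32")

def search(query_vec, k=2):
    """Brute-force L2 nearest-neighbour search -- the operation that
    faiss.IndexFlatL2 implements efficiently at scale."""
    dists = np.linalg.norm(corpus - query_vec, axis=1)
    order = np.argsort(dists)[:k]
    return order, dists[order]

# The query embedding comes from the same embedding model as the corpus;
# the LLM that later consumes the retrieved chunks plays no role here.
ids, dists = search(np.array([0.88, 0.12, 0.0, 0.05], dtype="float32"))
print(ids)  # indices of the nearest chunks
```

Because retrieval only compares vectors, the same index serves any downstream LLM and any document language the embedding model covers.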
Describe the solution you'd like
I propose integrating FAISS (Facebook AI Similarity Search) into OpenWebUI to enable efficient vector search capabilities. This integration should include:
A document processing pipeline that automatically converts uploaded PDFs and text documents into embeddings.
A FAISS index to store and efficiently search these embeddings.
An option in the UI to enable "Knowledge Base Search" when querying models.
Integration of the FAISS search results with the language model prompts to provide context-aware responses.
An admin interface to manage the document collection and FAISS index.
This feature would allow users to seamlessly use their personal document collections as a knowledge base for more accurate and context-aware interactions with language models.
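The proposed components could be sketched end-to-end roughly as follows. This is a toy illustration under stated assumptions: a hypothetical letter-frequency "embedding" replaces a real model, and brute-force numpy search stands in for the FAISS index (`faiss.IndexFlatL2` would be a drop-in for the search step):

```python
import numpy as np

def chunk(text, size=40):
    """Document processing: split text into fixed-size chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk_text):
    """Stand-in embedding: normalised letter-frequency vector (26 dims).
    A real pipeline would call an embedding model here."""
    vec = np.zeros(26, dtype="float32")
    for ch in chunk_text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def build_index(chunks):
    """Index construction: FAISS would store these vectors instead."""
    return np.stack([embed(c) for c in chunks])

def retrieve(index, chunks, query, k=1):
    """Knowledge-base search: return the k chunks nearest the query."""
    dists = np.linalg.norm(index - embed(query), axis=1)
    return [chunks[i] for i in np.argsort(dists)[:k]]

doc = "FAISS enables fast similarity search. OpenWebUI handles the chat."
chunks = chunk(doc)
index = build_index(chunks)
context = retrieve(index, chunks, "similarity search")

# Prompt integration: prepend retrieved context to the model prompt.
prompt = f"Context:\n{context[0]}\n\nQuestion: similarity search"
```

The key design point is that each stage (chunking, embedding, indexing, retrieval, prompt assembly) is independent, so the embedding model or index type can be swapped without touching the rest.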
Describe alternatives you've considered
Using simpler vector stores like Annoy or hnswlib, but FAISS offers superior performance and optional GPU acceleration.
Implementing a separate RAG service that communicates with OpenWebUI, but this would be less user-friendly and harder to maintain.
Using external vector database services, but this would compromise the self-hosted nature of OpenWebUI.
Additional context
This feature would be particularly useful for users who need to work with domain-specific knowledge or large personal document collections. It would significantly enhance the capabilities of OpenWebUI, making it a more comprehensive solution for AI-assisted information retrieval and generation. The integration should be designed to work efficiently with consumer-grade hardware (e.g., a single GPU) to maintain accessibility for personal and small business users.
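A back-of-envelope estimate suggests a flat (exact) FAISS index is well within consumer-hardware budgets. The figures below are assumptions for illustration: 768-dimensional float32 embeddings and roughly two chunks per page:

```python
# Memory estimate for a flat FAISS index over a large personal collection.
# Assumed parameters (not measurements): 768-dim float32 embeddings,
# ~2 chunks per page, 10,000 pages of documents.

dim = 768
bytes_per_vector = dim * 4    # float32 = 4 bytes per dimension
pages = 10_000
chunks = pages * 2
index_bytes = chunks * bytes_per_vector

print(f"{index_bytes / 1e6:.0f} MB")
```

At these assumptions the index is only a few tens of megabytes, so exact search fits comfortably in RAM or on a single GPU; compressed index types (e.g. IVF or product quantisation) would only become necessary at far larger scales.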