LangChain Box Data Loader #864

Open
chuckie82 opened this issue Jan 9, 2024 · 5 comments

@chuckie82

Is your feature request related to a problem? Please describe.

LLMs (Large Language Models) with RAG (Retrieval Augmented Generation) access a user's own data to generate answers grounded in facts from the user's documents. It would be great to have a LangChain data loading interface to Box, so that people can easily build generative AI applications that query Box data.

Describe the solution you'd like

LangChain offers a custom data loader class for this purpose. It would be great if the Box dev team looked into this. The list of document types and databases with a custom data loader is here:
https://python.langchain.com/docs/integrations/document_loaders/
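For illustration, here is a minimal sketch of the shape such a loader usually takes: a class whose `lazy_load()` yields documents carrying `page_content` and `metadata`. It uses only the standard library. `Document` here is a stand-in for LangChain's class of the same name, and `SketchBoxLoader`, `fetch_text`, and the `box://` source string are hypothetical names invented for this example, not part of any Box or LangChain API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List


@dataclass
class Document:
    """Stand-in for LangChain's Document: text plus metadata."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class SketchBoxLoader:
    """Hypothetical loader; fetch_text stands in for a real Box API call."""

    def __init__(self, file_ids: List[str], fetch_text: Callable[[str], str]):
        self.file_ids = file_ids
        self.fetch_text = fetch_text

    def lazy_load(self) -> Iterator[Document]:
        # Yield one Document per file so callers can stream large sets.
        for file_id in self.file_ids:
            yield Document(
                page_content=self.fetch_text(file_id),
                # The box:// scheme is made up for this sketch.
                metadata={"source": f"box://file/{file_id}"},
            )

    def load(self) -> List[Document]:
        # Eager variant: collect everything lazy_load() yields.
        return list(self.lazy_load())


# Fake in-memory "Box" used only for this example.
fake_box = {"101": "Q3 revenue summary", "102": "Onboarding checklist"}
docs = SketchBoxLoader(["101", "102"], fake_box.__getitem__).load()
```

A real integration would replace `fetch_text` with authenticated calls to Box and subclass LangChain's own base loader instead of defining these stand-ins.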

Describe alternatives you've considered

Alternatives would be porting Box data locally or copying it to another service that has a LangChain data loader interface.

@mwwoda
Contributor

mwwoda commented Jan 12, 2024

Thanks for submitting this issue. We currently don't have any plans to work on this, but PRs are always welcome. Also, please consider using our new-generation Python SDK, which we are focusing on now: https://github.com/box/box-python-sdk-gen.

@chuckie82
Author

Thanks for looking into this, @mwwoda.
I will look into a LangChain Box data loader myself to determine how much work it would be for me. However, given that Box competitors are offering LangChain data loaders, I hope the Box dev team will consider this feature in the near future.

Thanks for bringing box-python-sdk-gen to my attention.

@iokinpardo

@chuckie82 Hi, I am also very interested in a LangChain Box data loader, but for the JS version rather than the Python one. Were you able to make progress in this direction?

Many thanks in advance.

@shurrey

shurrey commented Sep 12, 2024

Hi @chuckie82 and @iokinpardo, we recently released langchain-box for Python. It is a work in progress, but we currently provide a Loader to get one or more files, or all files in a folder (optionally recursive). A Retriever is also live: it lets you search Box for files with a full-text search query and, if you have access to Box AI, ask a question across files and get the answer back as a Document.

We will soon offer search options to help you narrow the scope of the search (PR pending), plus the ability to ask Box AI but return the citations rather than the answer, so you can pull the pertinent parts of the documents to supplement your AI workflow with your Box content.

Everything in these first few releases will work only with files that have a text representation, and we are focusing on Python to start. The next few releases will include a Box Metadata Query search capability and BlobLoaders, to help with files that don't have a text representation or use cases where you want the Blob instead of the text. We'll tackle LangChain JS once the Python library is complete.
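As a rough illustration of the loading behaviour described above (all files in a folder, optionally recursive, skipping files with no text representation), here is a standard-library sketch. It does not use or reproduce the langchain-box API: the nested-dict folder tree, the `iter_text_files` helper, and the use of `None` to mark a missing text representation are all assumptions made up for this example.

```python
from typing import Any, Dict, Iterator, Tuple

# Hypothetical in-memory folder tree: each name maps to a sub-folder (dict),
# a file's text representation (str), or None when no text is available.
tree: Dict[str, Any] = {
    "readme.txt": "hello",
    "logo.png": None,
    "reports": {"q3.docx": "revenue up", "scan.tiff": None},
}


def iter_text_files(
    folder: Dict[str, Any], recursive: bool = False, prefix: str = ""
) -> Iterator[Tuple[str, str]]:
    """Yield (path, text) for every file that has a text representation."""
    for name, entry in folder.items():
        path = f"{prefix}/{name}"
        if isinstance(entry, dict):
            # Descend into sub-folders only when recursion is requested.
            if recursive:
                yield from iter_text_files(entry, recursive=True, prefix=path)
        elif entry is not None:
            yield path, entry


top_level = list(iter_text_files(tree))
everything = list(iter_text_files(tree, recursive=True))
```

With `recursive=False` only the top-level text file is returned; with `recursive=True` the text file inside `reports` is included as well, while the two image files are skipped in both cases.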

@iokinpardo

Wow!!! This seems super promising, @shurrey.
We are very interested in a LangChain JS version for Box, as we run all our AI workflows on top of Flowise AI, which is based on LangChain JS. We will be eager to see how the Python integration turns out so that we can adopt the later JS version. We consider this a key factor for our business.

Many thanks for all the work done on this.
