I want to raise the following discussion.
Status quo:
One document results in one collection in the vector database
The hash of the file content is the collection name
The metadata contains the filename
Deleting a document leaves its data in the vector database
Questions:
What do you think of storing all documents in a single collection?
My opinion: A vector database is more efficient at finding the k nearest vectors in one single collection than a Python application that fetches the k nearest vectors from each of x collections, sorts them, and takes the first k vectors from the merged result.
Filtering on metadata can easily be implemented to limit a search to a given file hash.
Please correct me if my assumption is incorrect.
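To make the comparison concrete, here is a minimal in-memory sketch of both strategies. It is not the project's code and does not use any real vector database client: the record layout, the `file_hash` metadata key, and the function names `top_k` and `top_k_across_collections` are assumptions made up for illustration.

```python
import heapq
import math


def top_k(records, query, k, file_hash=None):
    """Return the k records nearest to `query` (Euclidean distance).

    If `file_hash` is given, only records whose metadata carries that
    hash are considered (the metadata filter proposed above).
    """
    candidates = (
        r for r in records
        if file_hash is None or r["meta"]["file_hash"] == file_hash
    )
    return heapq.nsmallest(
        k, candidates, key=lambda r: math.dist(r["vector"], query)
    )


def top_k_across_collections(collections, query, k):
    """Status quo: query every collection, then merge and re-sort in Python."""
    partial = []
    for records in collections.values():
        partial.extend(top_k(records, query, k))
    return heapq.nsmallest(
        k, partial, key=lambda r: math.dist(r["vector"], query)
    )


# Toy data: one collection per file hash (hypothetical hashes and filenames).
collections = {
    "hash_a": [
        {"vector": (0.0, 0.0), "meta": {"file_hash": "hash_a", "filename": "a.pdf"}},
        {"vector": (5.0, 5.0), "meta": {"file_hash": "hash_a", "filename": "a.pdf"}},
    ],
    "hash_b": [
        {"vector": (1.0, 0.0), "meta": {"file_hash": "hash_b", "filename": "b.pdf"}},
        {"vector": (0.5, 0.5), "meta": {"file_hash": "hash_b", "filename": "b.pdf"}},
        {"vector": (9.0, 9.0), "meta": {"file_hash": "hash_b", "filename": "b.pdf"}},
    ],
}

# Proposed layout: the same records in one single collection.
single_collection = [r for records in collections.values() for r in records]

query = (0.0, 0.0)
merged_result = top_k_across_collections(collections, query, k=2)
single_result = top_k(single_collection, query, k=2)
filtered_result = top_k(single_collection, query, k=2, file_hash="hash_b")
```

Both strategies return the same neighbours here. The difference is that the status quo issues one query per collection and fetches up to k results from each before discarding most of them, while the single collection needs one query and can still be scoped to a document through the filter.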
Shouldn't the vectors be deleted from the vector store once the document is deleted?
I can see the benefit of keeping them in the store: if the file is scanned/uploaded again, they can just be reused.
But I see a problem when, for example, we want to add or update metadata such as the file name.
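The trade-off between reuse, deletion, and metadata updates can be sketched as follows. This is again a toy in-memory stand-in, not a real vector store API: the class `SingleCollectionStore` and its methods are hypothetical names chosen for this example.

```python
class SingleCollectionStore:
    """Toy single-collection store; every record carries its file hash."""

    def __init__(self):
        self.records = []

    def add_document(self, file_hash, filename, vectors):
        """Store vectors for a document; return True if new vectors were written.

        If the hash is already known, the vectors are reused and only the
        metadata is refreshed, so a re-upload under a new name does not
        leave a stale filename behind.
        """
        if any(r["meta"]["file_hash"] == file_hash for r in self.records):
            self.update_metadata(file_hash, filename=filename)
            return False
        for vector in vectors:
            self.records.append(
                {"vector": vector, "meta": {"file_hash": file_hash, "filename": filename}}
            )
        return True

    def update_metadata(self, file_hash, **changes):
        """Apply metadata changes to every record of one document."""
        updated = 0
        for record in self.records:
            if record["meta"]["file_hash"] == file_hash:
                record["meta"].update(changes)
                updated += 1
        return updated

    def delete_document(self, file_hash):
        """Remove all vectors of one document; return how many were removed."""
        before = len(self.records)
        self.records = [
            r for r in self.records if r["meta"]["file_hash"] != file_hash
        ]
        return before - len(self.records)


store = SingleCollectionStore()
first_upload = store.add_document("abc123", "report.pdf", [(0.1, 0.2), (0.3, 0.4)])
second_upload = store.add_document("abc123", "report_v2.pdf", [(0.1, 0.2), (0.3, 0.4)])
filenames = {r["meta"]["filename"] for r in store.records}
removed = store.delete_document("abc123")
```

Whether a delete should remove the vectors immediately, as `delete_document` does here, or keep them for later reuse is exactly the open question. The sketch only shows that both paths stay simple when every record is addressable by its file hash.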