
JRetza/nucliadb

 
 


Nuclia

The AI Search Database.

NucliaDB is a robust database for storing and searching unstructured data.

It is an out-of-the-box hybrid search database that uses vector, full-text and graph indexes.

NucliaDB is written in Rust and Python. We designed it to index large datasets and provide multi-tenant support.

When you use NucliaDB with Nuclia cloud, you get the power of an NLP database without the hassle of data extraction, enrichment and inference. We do all the hard work for you.

Features

  • Store text, files, vectors, labels and annotations
  • Perform text searches: given a word or set of words, return the resources in our database that contain them.
  • Perform semantic searches with vectors. For example, given a set of vectors, return the closest matches in our database. With NLP, this allows us to look for similar sentences without being constrained by exact keywords.
  • Export your data in a format compatible with most NLP pipelines (HuggingFace datasets, PyTorch, etc.)
  • Store original data, extracted data, and data pulled from the Understanding API
  • Index fields, paragraphs, and semantic sentences on index storage
  • Cloud data and insight extraction with the Nuclia Understanding API™
  • Cloud connection to train ML models with Nuclia Learning API™
  • Role-based security system with upstream proxy authentication validation
  • Resources with multiple fields and metadata
  • Text/HTML/Markdown plain fields support
  • Field types: text, file, link, conversation
  • Storage layer (PostgreSQL)
  • Blob support with S3-compatible API, GCS and Azure Blob Storage
  • Replication of index storage
  • Distributed search
  • Cloud-native
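
To make the text search and semantic search features above more concrete, here is a toy, in-memory sketch in Python. It is not NucliaDB's API or implementation: the `RESOURCES` corpus, the embedding vectors, and the `keyword_search` and `semantic_search` helpers are all hypothetical names and made-up data used only for illustration. It assumes cosine similarity as the closeness measure, which is a common choice but is not confirmed by this README.

```python
import math

# Hypothetical toy corpus: resource id -> (text, embedding vector).
# The vectors are invented for illustration; real ones come from a model.
RESOURCES = {
    "r1": ("the cat sat on the mat", [0.9, 0.1, 0.0]),
    "r2": ("a kitten rests on a rug", [0.8, 0.2, 0.1]),
    "r3": ("quarterly revenue grew", [0.0, 0.1, 0.9]),
}


def keyword_search(words):
    """Return ids of resources whose text contains every given word."""
    wanted = {w.lower() for w in words}
    return [
        rid
        for rid, (text, _) in RESOURCES.items()
        if wanted <= set(text.lower().split())
    ]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vector, top_k=2):
    """Return the ids of the top_k resources closest to query_vector."""
    ranked = sorted(
        RESOURCES,
        key=lambda rid: cosine(query_vector, RESOURCES[rid][1]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    print(keyword_search(["cat"]))
    print(semantic_search([0.85, 0.15, 0.05]))
```

In this sketch the keyword search only finds the resource that literally contains "cat", while the vector search also surfaces the "kitten" resource because its made-up embedding is close to the query vector. A real index would use approximate nearest-neighbour structures rather than a full scan.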

Architecture


Quickstart

Trying NucliaDB is super easy! You can extend your knowledge with the following readings:

💬 Community

🙋 FAQ

How is NucliaDB different from traditional search engines like Elasticsearch or Solr?

The core difference and advantage of NucliaDB is its architecture, built from the ground up for unstructured data. Its vector, keyword, graph and fuzzy search provide an API to use all the information extracted by the Nuclia Understanding API, bringing powerful NLP abilities to any application with low code and peace of mind.

What license does NucliaDB use?

NucliaDB is open source under the GNU Affero General Public License Version 3 (AGPLv3). Fundamentally, this means that you are free to use NucliaDB for your project. If you modify NucliaDB, you have to make the modifications public.

What is Nuclia's business model?

Our business model relies on our normalization API, which is based on the Nuclia Learning API and the Nuclia Understanding API. These two APIs use AI to transform unstructured data into NucliaDB-compatible data. We also offer NucliaDB as a service on our multi-cloud infrastructure: https://nuclia.cloud.

🤝 Contribute and spread the word

We are always happy to have contributions: code, documentation, issues, feedback, or even saying hello on Slack! Here is how you can get started:

✨ And to thank you for your contributions, claim your swag by emailing us at info at nuclia.com.

Reference

Meta
