tonbo-io/tonbo

Tonbo

Website | Rust Doc | Blog | Community

Introduction

Tonbo is an embedded persistent database built on Apache Arrow and Parquet. It offers essential KV-like methods (insert, filter, and range scan) for querying type-safe structured data efficiently and conveniently. Tonbo integrates seamlessly with other Arrow analytical tools such as DataFusion; refer to this example for a demonstration. Official support for DataFusion will be included in the next release.

Example

use std::ops::Bound;

use futures_util::stream::StreamExt;
use tonbo::{executor::tokio::TokioExecutor, tonbo_record, Projection, DB};

/// Use the macro to define the schema of a column family, much like an ORM.
/// It provides a type-safe read and write API.
#[tonbo_record]
pub struct User {
    #[primary_key]
    name: String,
    email: Option<String>,
    age: u8,
}

#[tokio::main]
async fn main() {
    // pluggable async runtime and I/O
    let db = DB::new("./db_path/users".into(), TokioExecutor::default())
        .await
        .unwrap();

    // insert with owned value
    db.insert(User {
        name: "Alice".into(),
        email: Some("alice@gmail.com".into()),
        age: 22,
    })
    .await
    .unwrap();

    {
        // Tonbo supports transactions
        let txn = db.transaction().await;

        // get from primary key
        let name = "Alice".into();

        // get a zero-copy reference to the record without any allocations
        let user = txn
            .get(
                &name,
                // Tonbo supports pushing down projection
                Projection::All,
            )
            .await
            .unwrap();
        assert!(user.is_some());
        assert_eq!(user.unwrap().get().age, Some(22));

        {
            let upper = "Blob".into();
            // range scan over primary keys from `name` (inclusive) to `upper` (exclusive)
            let mut scan = txn
                .scan((Bound::Included(&name), Bound::Excluded(&upper)))
                .await
                // Tonbo supports pushing down projection
                .projection(vec![1])
                .take()
                .await
                .unwrap();
            while let Some(entry) = scan.next().await.transpose().unwrap() {
                assert_eq!(
                    entry.value(),
                    Some(UserRef {
                        name: "Alice",
                        email: Some("alice@gmail.com"),
                        age: Some(22),
                    })
                );
            }
        }

        // commit transaction
        txn.commit().await.unwrap();
    }
}
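The scan in the example uses an inclusive lower bound and an exclusive upper bound. As a rough illustration of what those `Bound` values select, here is a minimal sketch that uses only the standard library. The `BTreeMap` is a stand-in for an ordered key space and is not how Tonbo stores data; `sample_users` and `keys_in_range` are hypothetical helper names introduced for this sketch.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Build a small ordered map of user names to ages (illustrative data only).
fn sample_users() -> BTreeMap<String, u8> {
    let mut map = BTreeMap::new();
    for (name, age) in [("Alice", 22u8), ("Bob", 30), ("Blob", 41), ("Carol", 35)] {
        map.insert(name.to_string(), age);
    }
    map
}

/// Return the keys that fall between `lower` and `upper`, in sorted order.
/// Note: `BTreeMap::range` panics if the lower bound is greater than the upper bound.
fn keys_in_range<'a>(
    map: &'a BTreeMap<String, u8>,
    lower: Bound<&str>,
    upper: Bound<&str>,
) -> Vec<&'a str> {
    map.range::<str, _>((lower, upper))
        .map(|(key, _)| key.as_str())
        .collect()
}
```

Calling `keys_in_range(&sample_users(), Bound::Included("Alice"), Bound::Excluded("Blob"))` selects only `"Alice"`: the key `"Blob"` sorts before `"Bob"` but is excluded by the upper bound, which mirrors why the scan in the example above yields a single record.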

Features

  • Fully asynchronous API.
  • Zero-copy rusty API ensuring safety with compile-time type and lifetime checks.
  • Vendor-agnostic:
    • Various usage methods, async runtimes, and file systems.
    • Most lightweight implementation of Arrow / Parquet LSM trees:
      • Define schema using just Arrow schema and store data in Parquet files.
      • (Optimistic) Transactions.
      • Leveled compaction strategy.
      • Push down filter, limit and projection.
  • Runtime schema definition (in the next release).
  • SQL (via Apache DataFusion).
  • Fusion storage across RAM, flash, SSD, and remote Object Storage Service (OSS) for each column family, balancing performance and cost efficiency per data block.
  • Blob storage (like BlobDB in RocksDB).
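
To make the "(Optimistic) Transactions" item more concrete, the following is a toy sketch of optimistic concurrency control in general: writes are buffered in the transaction and checked for conflicts only at commit time. It is not Tonbo's implementation, and Tonbo's actual conflict rules may differ. `Store`, `Txn`, and `Conflict` are hypothetical types that exist only in this sketch.

```rust
use std::collections::HashMap;

/// Toy versioned store: each key remembers the commit version that last wrote it.
#[derive(Default)]
struct Store {
    data: HashMap<String, (u64, String)>, // key -> (commit version, value)
    clock: u64,                           // version of the latest commit
}

/// A transaction buffers its writes and records the store version it started from.
struct Txn {
    snapshot: u64,
    writes: HashMap<String, String>,
}

/// Returned when a key written by the transaction was committed by someone else
/// after the transaction began.
#[derive(Debug, PartialEq)]
struct Conflict {
    key: String,
}

impl Store {
    fn begin(&self) -> Txn {
        Txn { snapshot: self.clock, writes: HashMap::new() }
    }

    fn get(&self, key: &str) -> Option<&str> {
        self.data.get(key).map(|(_, value)| value.as_str())
    }

    /// Validate first, then apply: no locks are held while the transaction runs.
    fn commit(&mut self, txn: Txn) -> Result<u64, Conflict> {
        for key in txn.writes.keys() {
            if let Some((version, _)) = self.data.get(key) {
                if *version > txn.snapshot {
                    return Err(Conflict { key: key.clone() });
                }
            }
        }
        self.clock += 1;
        let version = self.clock;
        for (key, value) in txn.writes {
            self.data.insert(key, (version, value));
        }
        Ok(version)
    }
}

impl Txn {
    fn insert(&mut self, key: &str, value: &str) {
        self.writes.insert(key.to_string(), value.to_string());
    }
}
```

In this sketch, two transactions that start from the same snapshot and write the same key cannot both commit: the first commit succeeds, and the second is rejected with a `Conflict` and can be retried from a fresh snapshot.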

Contributing to Tonbo

Feel free to ask questions or contact us through GitHub Discussions or issues.