bigdata
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…
Visually explore, understand, and present your data.
A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
Use SQL to build ELT pipelines on a data lakehouse.
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
re_data - fix data issues before your users and CEO discover them 😊
An analytics database that puts JSON and relational tables on equal footing
A blazing fast tool for building data pipelines: read, process and output events. Our community: https://t.me/file_d_community
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang wit…
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
🦔 PostHog provides open-source web & product analytics, session recording, feature flagging and A/B testing that you can self-host. Get started - free.
Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We have set out to bring state-of-the-art data engineering tools you love, such …
An orchestration platform for the development, production, and observation of data assets.
Enso Analytics is a self-service data prep and analysis platform designed for data teams.
🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
🧙 Build, run, and manage data pipelines for integrating and transforming data.
YTsaurus is a scalable and fault-tolerant open-source big data platform.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Efficient data transformation and modeling framework that is backwards compatible with dbt.
Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.
Build platforms that flexibly mix SQL, batch, and stream processing paradigms
MySQL replication topology management and HA
Self-serve BI to 10x your data team ⚡️
Data API Framework for AI Agents and Data Apps