dbt & Dagster

Using Dagster? Automatically load your dbt models as Dagster assets by importing an existing dbt project into Dagster.

Dagster orchestrates dbt alongside other technologies, so you can schedule dbt with Spark, Python, etc. in a single data pipeline.

Dagster's asset definition approach allows Dagster to understand dbt at the level of individual dbt models. This means that you can:

Use Dagster's UI or APIs to run subsets of your dbt models, seeds, and snapshots.
Track failures, logs, and run history for individual dbt models, seeds, and snapshots.
Define dependencies between individual dbt models and other data assets. For example, put dbt models after the Fivetran-ingested table that they read from, or put a machine learning model after the dbt models that it's trained on.

An asset graph like this:

Figure: Dagster asset graph with dbt, Fivetran, and TensorFlow assets

Can be produced from code like this:


from pathlib import Path

from dagster_dbt import DbtCliResource, dbt_assets, get_asset_key_for_model
from dagster_fivetran import build_fivetran_assets

from dagster import AssetExecutionContext, asset

# Assets for the tables that the Fivetran connector loads into the warehouse.
fivetran_assets = build_fivetran_assets(
    connector_id="postgres",
    destination_tables=["users", "orders"],
)


# Load the dbt project described by the manifest as Dagster assets.
@dbt_assets(manifest=Path("manifest.json"))
def dbt_project_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # Run `dbt build` and stream its events back to Dagster as they arrive.
    yield from dbt.cli(["build"], context=context).stream()


# A downstream asset that depends on the daily_order_summary dbt model.
@asset(
    compute_kind="tensorflow",
    deps=[get_asset_key_for_model([dbt_project_assets], "daily_order_summary")],
)
def predicted_orders(): ...

Getting started

There are a few ways to get started with Dagster and dbt:

Take the tutorial. We'll walk you through setting up dbt and Dagster together on your computer, using dbt's example jaffle shop project, the dagster-dbt library, and a data warehouse, such as DuckDB. By the end, you'll have a working dbt and Dagster project and a handful of materialized Dagster assets, including a chart powered by data from your dbt models.
Play around with a working dbt & Dagster project.
Browse the dagster-dbt integration reference for short lessons on dbt and Dagster topics.
Review the API docs for the dagster-dbt library.
Automatically load your dbt models as Dagster assets by importing an existing dbt project into Dagster.

Understanding how dbt models relate to Dagster asset definitions

Dagster’s asset definitions bear several similarities to dbt models. An asset definition contains an asset key, a set of upstream asset keys, and an operation that is responsible for computing the asset from its upstream dependencies. Models defined in a dbt project can be interpreted as Dagster asset definitions:

The asset key for a dbt model is (by default) the name of the model.
The upstream dependencies of a dbt model are defined with ref or source calls within the model's definition.
The computation required to compute the asset from its upstream dependencies is the SQL within the model's definition.
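The three points above can be illustrated with a small, library-free sketch that reads one dbt model file and pulls out the pieces an asset definition needs. Everything here is an assumption made for illustration: the helper name describe_model, the regular expressions, the sample SQL, and the stripe source are hypothetical, and the patterns only handle the simplest single-argument ref and two-argument source forms. The dagster-dbt library itself works from dbt's manifest (as in the manifest.json example earlier) rather than parsing SQL this way.

```python
import re
from pathlib import PurePath

# A single- or double-quoted name, captured without its quotes.
QUOTED = r"""['"]([^'"]+)['"]"""

# Matches {{ ref('model_name') }} (simplest form only).
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*" + QUOTED + r"\s*\)\s*\}\}")

# Matches {{ source('source_name', 'table_name') }}.
SOURCE_PATTERN = re.compile(
    r"\{\{\s*source\(\s*" + QUOTED + r"\s*,\s*" + QUOTED + r"\s*\)\s*\}\}"
)


def describe_model(file_name: str, sql: str) -> dict:
    """Summarize a dbt model in asset-definition terms (illustrative only)."""
    return {
        # The model name, taken from the file name, is the default asset key.
        "asset_key": PurePath(file_name).stem,
        # ref(...) and source(...) calls name the upstream dependencies.
        "refs": sorted(set(REF_PATTERN.findall(sql))),
        "sources": sorted(set(SOURCE_PATTERN.findall(sql))),
        # The SQL itself is the computation that produces the asset.
        "computation": sql.strip(),
    }


# Hypothetical model file contents.
ORDERS_SQL = """
select
    o.order_id,
    o.customer_id,
    p.amount
from {{ ref('raw_orders') }} as o
left join {{ source('stripe', 'payments') }} as p
    on o.order_id = p.order_id
"""

info = describe_model("models/orders.sql", ORDERS_SQL)
print(info["asset_key"])
print(info["refs"])
print(info["sources"])
```

For this sample, the asset key is derived from the file name orders.sql, and the upstream dependencies are the raw_orders model plus the payments table of the hypothetical stripe source.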

These similarities make it natural to interact with dbt models as asset definitions. Let’s take a look at a dbt model and an asset definition, in code:

Figure: Comparison of a dbt model and a Dagster asset in code

Here's what's happening in this example:

The first code block is a dbt model
- As dbt models are named using file names, this model is named orders
- The data for this model comes from a dependency named raw_orders
The second code block is a Dagster asset
- The asset key corresponds to the name of the dbt model, orders
- raw_orders is provided as an argument to the asset, defining it as a dependency

To learn how to load dbt models into Dagster as assets, check out the tutorial or the quick version in the dagster-dbt reference.