This repo contains a hands-on ELT pipeline project built with Snowflake, dbt Core, and Airflow, with the aim of practicing orchestration of dbt models using Cosmos.
Cosmos is an open-source package from Astronomer that simplifies running dbt Core projects as Airflow DAGs and Task Groups.
- Before Cosmos: Airflow runs your entire dbt project as a single task, so when that task fails halfway through it is hard to investigate and troubleshoot, and recovery requires a complete rerun from the beginning.
- With Cosmos: each dbt model becomes its own task group (a model run task plus a test task), giving better visibility into the runtime of your dbt project and making it easier to rerun from failure states (see the sketch below).
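To make this concrete, here is a minimal sketch (not taken from this repo) of the Task Group flavor: Cosmos's `DbtTaskGroup` embeds a dbt project inside an existing Airflow DAG, and Cosmos renders each model as a nested task group containing a run task and a test task. The project path, profile name, and `profiles.yml` location below are placeholder assumptions.

```python
# Minimal sketch with assumed paths/names: embed a dbt project in an existing
# Airflow DAG as a Cosmos task group. Each dbt model is rendered as a nested
# task group with a "run" task and a "test" task.
from datetime import datetime

from airflow.decorators import dag
from cosmos import DbtTaskGroup, ProfileConfig, ProjectConfig


@dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
def elt_with_dbt():
    DbtTaskGroup(
        group_id="transform",
        project_config=ProjectConfig("/usr/local/airflow/dags/dbt/data_pipeline"),
        # Assumes a profiles.yml is shipped alongside the dbt project.
        profile_config=ProfileConfig(
            profile_name="data_pipeline",
            target_name="dev",
            profiles_yml_filepath="/usr/local/airflow/dags/dbt/data_pipeline/profiles.yml",
        ),
    )


elt_with_dbt()
```

A `DbtDag`-based sketch, which turns the whole dbt project into its own DAG, appears after the setup steps below.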
Learning Resources:
- Orchestrate your dbt models with Astro, powered by Apache Airflow®
- astronomer.github.io/astronomer-cosmos
- github.com/astronomer/cosmos-demo
- Astronomer Academy - Astro: Onboarding
Snowflake setup: create the warehouse, database, role, schema, and dbt user that the project uses.
-- use an admin role
use role accountadmin;
-- create the DWH, DB, and role for transformation (dbt project)
create warehouse dbt_wh with warehouse_size='x-small';
create database dbt_db;
create role dbt_role;
-- grant permissions
grant usage on warehouse dbt_wh to role dbt_role;
grant all on database dbt_db to role dbt_role;
use role dbt_role;
create schema dbt_db.dbt_schema;
-- create the `dbt` user (will be used to set up dbt `profiles.yml`)
use role accountadmin;
create user if not exists dbt_airflow
password={password}
login_name='dbt_airflow'
must_change_password=FALSE
default_warehouse='dbt_wh'
default_role='dbt_role'
default_namespace='dbt_db.dbt_schema'
comment='dbt user used for data transformation';
grant role dbt_role to user dbt_airflow;
- Create a Python virtual environment using `virtualenv`: `brew install python@3.11 virtualenv`, then `virtualenv venv --python=python3.11`, then `. venv/bin/activate`
- Install dbt Core with `pip`: `pip install dbt-snowflake==1.7.1`
- Initialize the `data_pipeline` dbt project by running the `dbt init` command, using your Snowflake account ID and the dbt user (`dbt_airflow`) created in the Snowflake UI
- `cd` into the `data_pipeline` dbt project and run `dbt debug` to test the connection
- Build the dbt assets into the Snowflake `dev` schema. (I'm skipping the dbt modeling part here, as my learning focus is deploying dbt Core with Airflow Cosmos.)
- Install the Astro CLI and initialize a `dbt-dag` project: `brew install astro`, `mkdir dbt-dag`, `cd dbt-dag`, `astro dev init`. (Note: Astro requires Docker, which I've already installed.)
- Create a dbt virtual environment in the `Dockerfile`: `RUN python -m venv dbt_venv && source dbt_venv/bin/activate && pip install --no-cache-dir dbt-snowflake && deactivate`
- Update `requirements.txt` to install Cosmos and the Snowflake provider: `astronomer-cosmos` and `apache-airflow-providers-snowflake`
- Move the `data_pipeline` dbt project into the `dags/dbt/` directory (docs)
- Create a `.py` DAG file in the root of the DAGs directory (a sketch of this file is shown after this step list). Useful docs:
  - Create a dag file
  - Profile Mapping - SnowflakeUserPassword
  - Execution Modes
- Start Airflow by running `astro dev start`
- Access the Airflow UI to trigger the DAG
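As referenced in the DAG-file step above, here is a sketch of what the `.py` DAG file could look like, using Cosmos's `DbtDag` with the `SnowflakeUserPassword` profile mapping and the dbt executable from the `dbt_venv` built in the Dockerfile. The connection id `snowflake_conn`, the DAG id, the schedule, and the paths are assumptions; the Snowflake connection itself must be created separately (for example in the Airflow UI) with the `dbt_airflow` credentials.

```python
# dags/dbt_dag.py -- a sketch, assuming an Airflow connection named
# "snowflake_conn" exists with the dbt_airflow credentials.
import os
from datetime import datetime

from cosmos import DbtDag, ExecutionConfig, ProfileConfig, ProjectConfig
from cosmos.profiles import SnowflakeUserPasswordProfileMapping

profile_config = ProfileConfig(
    profile_name="default",
    target_name="dev",
    # Builds the dbt profile from the Airflow connection at runtime.
    profile_mapping=SnowflakeUserPasswordProfileMapping(
        conn_id="snowflake_conn",
        profile_args={"database": "dbt_db", "schema": "dbt_schema"},
    ),
)

dbt_snowflake_dag = DbtDag(
    dag_id="dbt_dag",
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/data_pipeline"),
    profile_config=profile_config,
    # Point Cosmos at the dbt executable installed in the Dockerfile's venv.
    execution_config=ExecutionConfig(
        dbt_executable_path=f"{os.environ.get('AIRFLOW_HOME', '/usr/local/airflow')}/dbt_venv/bin/dbt",
    ),
    # Run `dbt deps` before each task so package dependencies are available.
    operator_args={"install_deps": True},
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
)
```

Once Airflow is running, this DAG appears with one task group per dbt model, each containing a run task and a test task.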
Acknowledgement: This project was created by following a code-along tutorial designed by @jayzern


