Welcome! In this notebook, we'll build a complete, production-ready ELT pipeline from scratch. Here’s a brief overview of our project:
- The Dataset: We'll use the "Jaffle Shop," a fictional e-commerce store. Our raw data is split across three CSV files: `raw_customers`, `raw_orders`, and `raw_payments`. These tables are logically linked by shared `id` columns, which we'll use to join them, as shown in the schema diagram below.
- The Tasks: We will build an end-to-end pipeline. This includes Loading the data (using `dbt seed`), Transforming it with a 3-layer dbt model (staging → intermediate → marts), Testing our models for data quality (like uniqueness and relationships), and finally, Orchestrating the entire process into an automated, scheduled job with Airflow. (A minimal sketch of a staging model follows this list.)
- The Audience: This pipeline is for any business that wants to answer the critical question, "Who are my most valuable customers?" Our final product will be a clean, reliable, and analytics-ready table (`dim_customers`) that a BI tool (like Tableau or Power BI) can connect to for analysis.
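To make the staging layer concrete before we start, here is a minimal sketch of what such a model might look like. The model and column names (`stg_customers`, `customer_id`, etc.) follow the classic Jaffle Shop conventions but are illustrative, not taken from this notebook:

```sql
-- models/staging/stg_customers.sql (hypothetical example)
-- Staging models do light cleanup only: rename raw columns to
-- consistent names so downstream models have a stable interface.
select
    id as customer_id,
    first_name,
    last_name
from {{ ref('raw_customers') }}
```

The intermediate and marts layers would then `ref()` this staging model rather than the raw seed, so each layer builds only on the one beneath it.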
Follow the notebook to work through each of these steps.
We have now built and orchestrated a full data pipeline. It created the final analytical table, and the output directly answers the core business question: "Who are our most valuable customers?"
- Built Models (dbt): We used dbt to load seed data, run transformations (staging → intermediate → marts), and test our data quality.
- Orchestrated Pipeline (Airflow): We wrote an Airflow DAG and used the `airflow` command to run our entire dbt pipeline (`seed`, `run`, and `test`) in the correct, automated sequence (see the DAG sketch after this list).
- The Answer: The final output is the single, reliable `dim_customers` table, which a BI tool (like Tableau or Power BI) could connect to for analysis (an example query is sketched at the end of this section).
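As a sketch of what that orchestration can look like, here is a minimal DAG that chains the three dbt commands with `BashOperator`. The DAG id, schedule, and project path are assumptions for illustration (Airflow 2.4+ syntax):

```python
# dags/jaffle_shop_elt.py (hypothetical example)
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

DBT_DIR = "/opt/airflow/jaffle_shop"  # assumed dbt project location

with DAG(
    dag_id="jaffle_shop_elt",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # run the pipeline once a day
    catchup=False,
) as dag:
    seed = BashOperator(task_id="dbt_seed", bash_command=f"dbt seed --project-dir {DBT_DIR}")
    run = BashOperator(task_id="dbt_run", bash_command=f"dbt run --project-dir {DBT_DIR}")
    test = BashOperator(task_id="dbt_test", bash_command=f"dbt test --project-dir {DBT_DIR}")

    # Enforce the correct order: load seeds, build models, then test.
    seed >> run >> test
```

Running `dbt test` as the final task means a data-quality failure fails the DAG run, so bad data never silently reaches `dim_customers`.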
This is the core workflow of a modern data pipeline!
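For example, once `dim_customers` exists, answering the business question is a short query away. The column names below (`lifetime_value` in particular) are assumptions based on the standard Jaffle Shop project, not confirmed by this notebook:

```sql
-- Hypothetical BI-style query against the final mart table.
select
    customer_id,
    first_name,
    last_name,
    lifetime_value
from dim_customers
order by lifetime_value desc
limit 10;
```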
