DbtTutorial

Tutorial 2: Building a Customer Analytics Pipeline

*Time**: 45 minutes

8 min read

Tutorial 2: Building a Customer Analytics Pipeline

Time: 45 minutes Difficulty: Beginner to Intermediate Prerequisites: Tutorial 1


What You'll Build

A production-ready customer analytics pipeline with:

  • Multi-source data integration (CRM + Product + Payments)
  • Staging → Intermediate → Marts architecture
  • Customer segmentation logic
  • Cohort analysis
  • Revenue attribution
  • Comprehensive testing

What You'll Learn

  • How to organize models in layers
  • Building reusable intermediate models
  • Advanced SQL patterns (window functions, cohorts)
  • Testing strategies for business logic
  • Documentation best practices
  • Model selection syntax

Scenario

You're an analytics engineer at GrowthCo, a SaaS company. The business needs:

  1. Customer 360 view: Combine data from Salesforce, Stripe, and product usage
  2. Segmentation: Classify customers by behavior and value
  3. Cohort analysis: Track retention by signup month
  4. Revenue metrics: MRR, churn, expansion revenue

Step 1: Set Up the Project

Create New Project


Step 2: Load Sample Data

Create Seed Files

seeds/raw_crm_accounts.csv:

seeds/raw_subscriptions.csv:

seeds/raw_product_usage.csv:

seeds/raw_invoices.csv:

Load Seeds


Step 3: Define Sources

models/staging/_sources.yml:


Step 4: Build Staging Models

Staging: CRM Accounts

models/staging/salesforce/stg_salesforce__accounts.sql:

Staging: Subscriptions

models/staging/stripe/stg_stripe__subscriptions.sql:

Staging: Product Usage

models/staging/product_events/stg_product__usage.sql:

Staging: Invoices

models/staging/stripe/stg_stripe__invoices.sql:

Run Staging Models


Step 5: Build Intermediate Models

Intermediate: Customer Subscriptions

models/intermediate/customers/int_customer_subscriptions.sql:

Intermediate: Product Engagement Metrics

models/intermediate/customers/int_customer_engagement.sql:

Intermediate: Revenue Metrics

models/intermediate/revenue/int_subscription_revenue.sql:

Run Intermediate Models


Step 6: Build Mart Models

Mart: Customer Dimension

models/marts/customers/dim_customers.sql:

Mart: Monthly Cohort Analysis

models/marts/customers/fct_customer_cohorts.sql:

Mart: Monthly Revenue Summary

models/marts/revenue/fct_monthly_revenue.sql:

Run All Models


Step 7: Add Comprehensive Tests

models/staging/salesforce/stg_salesforce__models.yml:

models/marts/customers/customer_models.yml:

Run Tests


Step 8: Query Your Analytics

Connect and Query


Step 9: Update Project Configuration

dbt_project.yml:

Rebuild with New Config


Step 10: Advanced Model Selection


Challenge Exercises

Challenge 1: Add Churn Prediction Score

Create a churn_risk_score (0-100) in dim_customers based on:

  • Days since last activity (higher = more risk)
  • Total events (lower = more risk)
  • Features used (fewer = more risk)

Challenge 2: Create Weekly Active Users Model

Build fct_weekly_active_users showing unique accounts active each week.

Challenge 3: Add Data Quality Test

Write a custom test ensuring lifetime_value >= (mrr * invoices_paid).


What You've Learned

✅ Multi-layered dbt architecture (staging → intermediate → marts) ✅ Combining data from multiple sources ✅ Advanced SQL (window functions, cohorts, segmentation) ✅ Comprehensive testing strategies ✅ Model selection patterns ✅ Production-ready project structure


Next Steps

  • Tutorial 3: Incremental models and performance optimization
  • Tutorial 4: Advanced testing and data quality frameworks
  • Best Practices: Learn professional patterns

Need help implementing this in your organization? Get in touch for consulting or custom training.

Stay in the loop

Get weekly insights on data engineering, analytics, and AI—delivered straight to your inbox.

No spam. Unsubscribe anytime.