Tutorial 2: Building a Customer Analytics Pipeline
Time: 45 minutes Difficulty: Beginner to Intermediate Prerequisites: Tutorial 1
What You'll Build
A production-ready customer analytics pipeline with:
- Multi-source data integration (CRM + Product + Payments)
- Staging → Intermediate → Marts architecture
- Customer segmentation logic
- Cohort analysis
- Revenue attribution
- Comprehensive testing
What You'll Learn
- How to organize models in layers
- Building reusable intermediate models
- Advanced SQL patterns (window functions, cohorts)
- Testing strategies for business logic
- Documentation best practices
- Model selection syntax
Scenario
You're an analytics engineer at GrowthCo, a SaaS company. The business needs:
- Customer 360 view: Combine data from Salesforce, Stripe, and product usage
- Segmentation: Classify customers by behavior and value
- Cohort analysis: Track retention by signup month
- Revenue metrics: MRR, churn, expansion revenue
Step 1: Set Up the Project
Create New Project
Step 2: Load Sample Data
Create Seed Files
seeds/raw_crm_accounts.csv:
seeds/raw_subscriptions.csv:
seeds/raw_product_usage.csv:
seeds/raw_invoices.csv:
Load Seeds
Step 3: Define Sources
models/staging/_sources.yml:
Step 4: Build Staging Models
Staging: CRM Accounts
models/staging/salesforce/stg_salesforce__accounts.sql:
Staging: Subscriptions
models/staging/stripe/stg_stripe__subscriptions.sql:
Staging: Product Usage
models/staging/product_events/stg_product__usage.sql:
Staging: Invoices
models/staging/stripe/stg_stripe__invoices.sql:
Run Staging Models
Step 5: Build Intermediate Models
Intermediate: Customer Subscriptions
models/intermediate/customers/int_customer_subscriptions.sql:
Intermediate: Product Engagement Metrics
models/intermediate/customers/int_customer_engagement.sql:
Intermediate: Revenue Metrics
models/intermediate/revenue/int_subscription_revenue.sql:
Run Intermediate Models
Step 6: Build Mart Models
Mart: Customer Dimension
models/marts/customers/dim_customers.sql:
Mart: Monthly Cohort Analysis
models/marts/customers/fct_customer_cohorts.sql:
Mart: Monthly Revenue Summary
models/marts/revenue/fct_monthly_revenue.sql:
Run All Models
Step 7: Add Comprehensive Tests
models/staging/salesforce/stg_salesforce__models.yml:
models/marts/customers/customer_models.yml:
Run Tests
Step 8: Query Your Analytics
Connect and Query
Step 9: Update Project Configuration
dbt_project.yml:
Rebuild with New Config
Step 10: Advanced Model Selection
Challenge Exercises
Challenge 1: Add Churn Prediction Score
Create a churn_risk_score (0-100) in dim_customers based on:
- Days since last activity (higher = more risk)
- Total events (lower = more risk)
- Features used (fewer = more risk)
Challenge 2: Create Weekly Active Users Model
Build fct_weekly_active_users showing unique accounts active each week.
Challenge 3: Add Data Quality Test
Write a custom test ensuring lifetime_value >= (mrr * invoices_paid).
What You've Learned
✅ Multi-layered dbt architecture (staging → intermediate → marts) ✅ Combining data from multiple sources ✅ Advanced SQL (window functions, cohorts, segmentation) ✅ Comprehensive testing strategies ✅ Model selection patterns ✅ Production-ready project structure
Next Steps
- Tutorial 3: Incremental models and performance optimization
- Tutorial 4: Advanced testing and data quality frameworks
- Best Practices: Learn professional patterns
Need help implementing this in your organization? Get in touch for consulting or custom training.