dbt Best Practices
This guide covers proven patterns and practices for building maintainable, scalable dbt projects. These recommendations come from real-world implementations across hundreds of organizations.
Project Structure & Organization
Standard Folder Layout
Naming Conventions
Model Prefixes
Examples
Double Underscore Convention
Use __ to separate source/subject from entity:
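As a rough illustration of the convention, the sketch below joins two staging models whose names follow the `prefix_source__entity` pattern. The sources (`stripe`, `shopify`), model names, and columns are hypothetical.

```sql
-- Hypothetical model names that follow prefix_source__entity:
--   stg_stripe__payments  -> staging layer, source "stripe", entity "payments"
--   stg_shopify__orders   -> staging layer, source "shopify", entity "orders"
select
    orders.order_id,
    payments.payment_id
from {{ ref('stg_shopify__orders') }} as orders
left join {{ ref('stg_stripe__payments') }} as payments
    on orders.order_id = payments.order_id
```

The double underscore keeps multi-word sources and entities unambiguous, since single underscores can then be used freely inside either part.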
Model Organization Rules
Staging Layer
- Purpose: Clean and standardize raw sources
- Materialization: Views (lightweight, always fresh)
- Grain: 1:1 with source tables
- Transformations: Column renaming, type casting, basic cleaning
- No joins: Keep staging models simple
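A staging model that follows these rules could look like the minimal sketch below. The source name, file path, and column names are assumptions for illustration, and the exact cast types depend on your warehouse.

```sql
-- models/staging/stripe/stg_stripe__payments.sql (hypothetical path)
-- One row per row in the raw table: only renames, casts, and light cleaning.
{{ config(materialized='view') }}

select
    id as payment_id,
    order_id,
    lower(status) as payment_status,
    cast(amount as numeric) as amount_cents,
    cast(created as timestamp) as created_at
from {{ source('stripe', 'payments') }}
```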
Intermediate Layer
- Purpose: Build reusable components
- Materialization: Views or ephemeral
- Grain: Can change from source
- Transformations: Joins, aggregations, window functions
- Not exposed to BI tools: Internal only
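One way an intermediate model might change grain is sketched below: payments are rolled up to one row per order. The model names, columns, and the `'success'` status value are hypothetical.

```sql
-- models/intermediate/int_payments__aggregated_to_orders.sql (hypothetical path)
-- Grain changes here: one row per order instead of one row per payment.
{{ config(materialized='ephemeral') }}

select
    order_id,
    sum(
        case when payment_status = 'success' then amount_cents else 0 end
    ) as paid_amount_cents,
    count(*) as payment_count
from {{ ref('stg_stripe__payments') }}
group by order_id
```

Because the model is ephemeral, dbt compiles it into downstream models as a CTE rather than creating an object in the warehouse.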
Marts Layer
- Purpose: Business-defined entities for analytics
- Materialization: Tables (fast queries)
- Grain: Defined by business need
- Transformations: Final business logic
- Exposed to BI tools: These are what analysts query
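A marts model along these lines might look like the following sketch, which assumes hypothetical upstream models `stg_shopify__orders` and `int_payments__aggregated_to_orders` and a simple "paid" rule invented for the example.

```sql
-- models/marts/fct_orders.sql (hypothetical path)
{{ config(materialized='table') }}

with orders as (
    select order_id, customer_id, ordered_at
    from {{ ref('stg_shopify__orders') }}
),

order_payments as (
    select order_id, paid_amount_cents, payment_count
    from {{ ref('int_payments__aggregated_to_orders') }}
)

select
    orders.order_id,
    orders.customer_id,
    orders.ordered_at,
    coalesce(order_payments.paid_amount_cents, 0) as paid_amount_cents,
    coalesce(order_payments.payment_count, 0) as payment_count,
    -- Final business logic lives in the mart (assumed rule: any payment means paid)
    case
        when coalesce(order_payments.paid_amount_cents, 0) > 0 then true
        else false
    end as is_paid
from orders
left join order_payments
    on orders.order_id = order_payments.order_id
```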
SQL Style Guide
General Principles
- Readability over cleverness: Explicit is better than implicit
- Consistency: Follow one style across the project
- Modularity: Break complex logic into CTEs
- Comments: Explain "why", not "what"
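These principles can be sketched in a single model: each CTE does one job, columns are listed explicitly, and the comment records a reason rather than restating the SQL. The model name, columns, and the revenue rule are assumptions made up for the example.

```sql
with orders as (
    select order_id, customer_id, ordered_at, status
    from {{ ref('stg_shopify__orders') }}
),

completed_orders as (
    -- Why: revenue is recognized only once an order ships (assumed business
    -- rule), so pending and cancelled orders are excluded before aggregating.
    select order_id, customer_id, ordered_at
    from orders
    where status = 'shipped'
)

select
    customer_id,
    min(ordered_at) as first_order_at,
    count(*) as completed_order_count
from completed_orders
group by customer_id
```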
Formatting Rules
Specific Rules
SELECT statements
JOINs
CTEs (Common Table Expressions)
CASE statements
Testing Strategy
Layer-by-Layer Testing
Staging Models
Marts Models
Custom Tests
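A custom generic test can be written as a Jinja `test` block that returns the failing rows. The sketch below is one possible shape; the test name `is_non_negative` and its file location are hypothetical.

```sql
{# tests/generic/is_non_negative.sql (hypothetical path) #}
{% test is_non_negative(model, column_name) %}

-- Returns the offending rows; the test passes when no rows come back
select {{ column_name }}
from {{ model }}
where {{ column_name }} < 0

{% endtest %}
```

Once defined, the test can be attached to any column in a model's YAML properties by name, in the same way as the built-in `unique` and `not_null` tests.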
Testing Best Practices
- Test at every layer (staging, marts)
- Use generic tests for common patterns
- Write custom tests for business rules
- Set severity levels (warn vs. error)
- Run tests in CI/CD pipelines
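As a sketch of a business-rule test with a relaxed severity, the singular test below warns instead of failing the run. The model `fct_orders`, its columns, and the file path are hypothetical.

```sql
-- tests/assert_paid_amount_is_not_negative.sql (hypothetical path)
-- A singular test fails (or warns) when the query returns one or more rows.
{{ config(severity='warn') }}

select
    order_id,
    paid_amount_cents
from {{ ref('fct_orders') }}
where paid_amount_cents < 0
```

Using `warn` for softer rules keeps CI green while still surfacing the issue; rules that must never be violated can keep the default `error` severity.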
Performance Optimization
Materialization Strategy
| Layer | Materialization | Rationale |
|---|---|---|
| Staging | View | Lightweight, always fresh, rarely queried directly |
| Intermediate | View or Ephemeral | No storage cost, compiled into downstream models |
| Marts | Table | Fast query performance, worth the storage |
| Large Marts | Incremental | Only process new data, not full refresh |
Incremental Models
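A minimal incremental model, of the kind suggested for large marts, might look like the sketch below. The model name `stg_app__events`, the `event_id` key, and the `occurred_at` column are assumptions; late-arriving data may call for a wider lookback window than this simple filter provides.

```sql
{{
    config(
        materialized='incremental',
        unique_key='event_id'
    )
}}

select
    event_id,
    user_id,
    event_type,
    occurred_at
from {{ ref('stg_app__events') }}

{% if is_incremental() %}
-- On incremental runs, only pick up rows newer than what is already loaded
where occurred_at > (select max(occurred_at) from {{ this }})
{% endif %}
```

On the first run, or with `--full-refresh`, `is_incremental()` is false and the whole table is rebuilt.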
Partitioning (BigQuery)
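On BigQuery, partitioning and clustering can be declared in the model config. The following is a sketch assuming the dbt-bigquery adapter; the model and column names are hypothetical.

```sql
{{
    config(
        materialized='table',
        partition_by={
            'field': 'occurred_at',
            'data_type': 'timestamp',
            'granularity': 'day'
        },
        cluster_by=['user_id']
    )
}}

select
    event_id,
    user_id,
    event_type,
    occurred_at
from {{ ref('stg_app__events') }}
```

Queries that filter on `occurred_at` can then prune partitions instead of scanning the full table.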
Query Optimization Tips
- Filter early: Apply WHERE clauses in CTEs, not at the end
- Avoid SELECT *: Only select needed columns
- Use refs wisely: Don't ref() the same large model multiple times
- Leverage database features: Partitions, clusters, indexes
- Monitor costs: Use dbt artifacts to track run times and costs
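The first two tips can be sketched together: the filter and the column pruning happen in the first CTE, so every later step works on less data. The model name, columns, and the `start_date` variable (with its default) are hypothetical.

```sql
with recent_events as (
    -- Filter and prune columns up front so later steps scan less data
    select user_id, event_type, occurred_at
    from {{ ref('stg_app__events') }}
    where occurred_at >= '{{ var("start_date", "2024-01-01") }}'
),

events_per_user as (
    select
        user_id,
        count(*) as event_count,
        max(occurred_at) as last_event_at
    from recent_events
    group by user_id
)

select user_id, event_count, last_event_at
from events_per_user
```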
Documentation Best Practices
YAML Documentation
Inline SQL Comments
Macros & DRY Principles
Reusable Logic with Macros
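A small macro is often enough to remove repeated expressions. The sketch below is one possible version; the macro name, file path, and rounding approach are assumptions, and the exact numeric behavior depends on your warehouse.

```sql
{# macros/cents_to_dollars.sql (hypothetical path) #}
{# Usage in a model: select {{ cents_to_dollars('amount_cents') }} as amount_dollars #}
{% macro cents_to_dollars(column_name, scale=2) %}
    round({{ column_name }} / 100.0, {{ scale }})
{% endmacro %}
```

Keeping the conversion in one place means a change to the rounding rule is made once rather than in every model that uses it.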
Cross-Database Compatibility
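For date logic that differs between warehouses, dbt's built-in cross-database macros can stand in for adapter-specific functions. The sketch below assumes a recent dbt version where these macros live in the `dbt` namespace; the model and column names are hypothetical.

```sql
select
    order_id,
    -- Compiles to the appropriate date functions for the active adapter
    {{ dbt.date_trunc('month', 'ordered_at') }} as order_month,
    {{ dbt.datediff('ordered_at', 'shipped_at', 'day') }} as days_to_ship
from {{ ref('stg_shopify__orders') }}
```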
Version Control & CI/CD
Git Workflow
Pre-commit Checks
Slim CI (Only Test Changed Models)
Environment Management
profiles.yml Structure
Environment-Specific Config
Monitoring & Observability
Key Metrics to Track
- Run duration: Are models getting slower?
- Test failures: Data quality degrading?
- Row counts: Unexpected spikes or drops?
- Freshness: Is source data stale?
Source Freshness
Exposure Tracking
Common Anti-Patterns to Avoid
❌ Anti-Pattern 1: SELECT * in Production
❌ Anti-Pattern 2: Hard-Coded Values
❌ Anti-Pattern 3: Circular Dependencies
❌ Anti-Pattern 4: Unstable Sort Orders
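One common form of this problem is deduplicating with `row_number()` over a sort key that can tie, which lets the surviving row change between runs. A possible fix, sketched with hypothetical model and column names, is to add a unique tie-breaker to the ordering:

```sql
with ranked_payments as (
    select
        payment_id,
        order_id,
        amount_cents,
        updated_at,
        row_number() over (
            partition by order_id
            -- updated_at alone can tie; payment_id makes the order deterministic
            order by updated_at desc, payment_id desc
        ) as row_num
    from {{ ref('stg_stripe__payments') }}
)

select payment_id, order_id, amount_cents, updated_at
from ranked_payments
where row_num = 1
```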
❌ Anti-Pattern 5: Testing in Production
Quick Wins Checklist
- Adopt standard folder structure (staging/intermediate/marts)
- Use consistent naming conventions (stg_, int_, fct_, dim_)
- Add unique/not_null tests to all primary keys
- Document business-critical models
- Set up CI/CD to run dbt on pull requests
- Configure source freshness checks
- Use incremental models for large event tables
- Partition large tables by date
- Version control everything in Git
- Monitor dbt run performance over time