dbt Best Practices

This guide covers proven patterns and practices for building maintainable, scalable dbt projects. These recommendations come from real-world implementations across hundreds of organizations.


Project Structure & Organization

Standard Folder Layout

Naming Conventions

Model Prefixes

Examples

Double Underscore Convention

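Use __ (double underscore) to separate the source or subject from the entity, for example stg_stripe__payments (source: stripe, entity: payments).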

Model Organization Rules

Staging Layer

  • Purpose: Clean and standardize raw sources
  • Materialization: Views (lightweight, always fresh)
  • Grain: 1:1 with source tables
  • Transformations: Column renaming, type casting, basic cleaning
  • No joins: Keep staging models simple
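The staging rules above can be sketched as a single model. This is only an illustration: the source name stripe, the table payments, and every column name are hypothetical, and the source is assumed to be declared in a sources YAML file.

```sql
-- models/staging/stripe/stg_stripe__payments.sql (hypothetical path and names)
{{ config(materialized='view') }}

select
    -- column renaming
    id as payment_id,
    orderid as order_id,

    -- type casting
    cast(amount as numeric) as amount_cents,
    cast(created as timestamp) as created_at,

    -- basic cleaning
    lower(trim(status)) as payment_status

-- one source table in, one staging model out: no joins, no aggregation
from {{ source('stripe', 'payments') }}
```

Keeping the model to one source table preserves the 1:1 grain, so any later change to the raw schema is absorbed in exactly one place.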

Intermediate Layer

  • Purpose: Build reusable components
  • Materialization: Views or ephemeral
  • Grain: Can change from source
  • Transformations: Joins, aggregations, window functions
  • Not exposed to BI tools: Internal only
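A minimal sketch of an intermediate model, assuming a hypothetical staging model named stg_stripe__payments with the columns shown. It changes the grain from one row per payment to one row per order and is never queried by BI tools.

```sql
-- models/intermediate/int_payments__aggregated_to_orders.sql (hypothetical)
{{ config(materialized='ephemeral') }}

with payments as (

    select
        order_id,
        amount_cents,
        payment_status
    from {{ ref('stg_stripe__payments') }}

),

aggregated as (

    -- grain changes here: many payments collapse into one row per order
    select
        order_id,
        sum(amount_cents) as total_amount_cents,
        count(*) as payment_count
    from payments
    where payment_status = 'success'
    group by order_id

)

select
    order_id,
    total_amount_cents,
    payment_count
from aggregated
```

Because the model is ephemeral, dbt inlines it as a CTE in whichever models reference it, so nothing is created in the warehouse.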

Marts Layer

  • Purpose: Business-defined entities for analytics
  • Materialization: Tables (fast queries)
  • Grain: Defined by business need
  • Transformations: Final business logic
  • Exposed to BI tools: These are what analysts query
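As an illustration of a marts model, the sketch below builds an orders fact table. The upstream models stg_shop__orders and int_payments__aggregated_to_orders, and all column names, are assumptions made for the example.

```sql
-- models/marts/fct_orders.sql (hypothetical)
{{ config(materialized='table') }}

with orders as (

    select
        order_id,
        customer_id,
        ordered_at
    from {{ ref('stg_shop__orders') }}

),

order_payments as (

    select
        order_id,
        total_amount_cents
    from {{ ref('int_payments__aggregated_to_orders') }}

)

select
    orders.order_id,
    orders.customer_id,
    orders.ordered_at,
    -- final business logic: unpaid orders count as zero revenue
    coalesce(order_payments.total_amount_cents, 0) / 100.0 as order_total,
    case
        when order_payments.order_id is null then false
        else true
    end as is_paid
from orders
left join order_payments
    on orders.order_id = order_payments.order_id
```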

SQL Style Guide

General Principles

  • Readability over cleverness: Explicit is better than implicit
  • Consistency: Follow one style across the project
  • Modularity: Break complex logic into CTEs
  • Comments: Explain "why", not "what"
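One way these principles might look in practice is sketched below. The upstream model int_customers__order_history, its columns, and the 90-day rule are hypothetical; the point is the shape: named CTEs, explicit columns, and a comment that records the reason behind the logic rather than restating it.

```sql
with customers as (

    select
        customer_id,
        last_order_at,
        days_since_last_order
    from {{ ref('int_customers__order_history') }}

),

segmented as (

    select
        customer_id,
        last_order_at,
        -- why: finance reports anyone inactive for more than 90 days as churned
        -- (assumed business rule for this example)
        case
            when last_order_at is null then 'never_ordered'
            when days_since_last_order > 90 then 'churned'
            else 'active'
        end as customer_segment
    from customers

)

select
    customer_id,
    last_order_at,
    customer_segment
from segmented
```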

Formatting Rules

Specific Rules

SELECT statements

JOINs

CTEs (Common Table Expressions)

CASE statements


Testing Strategy

Layer-by-Layer Testing

Staging Models

Marts Models

Custom Tests

Testing Best Practices

  • Test at every layer (staging, marts)
  • Use generic tests for common patterns
  • Write custom tests for business rules
  • Set severity levels (warn vs error)
  • Run tests in CI/CD pipelines
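A custom business-rule test can be written as a singular test: a SQL file in the tests directory that fails when the query returns any rows. The sketch below assumes a hypothetical fct_orders model with an order_total column, and sets the severity to warn so a failure is reported without stopping the run.

```sql
-- tests/assert_order_totals_are_non_negative.sql (hypothetical file name)
{{ config(severity='warn') }}

-- every row returned is a violation of the rule
select
    order_id,
    order_total
from {{ ref('fct_orders') }}
where order_total < 0
```

Switching severity to error makes the same check block deployment, which is usually the right choice once the rule is known to hold.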

Performance Optimization

Materialization Strategy

Layer          Materialization      Rationale
Staging        View                 Lightweight, always fresh, rarely queried directly
Intermediate   View or Ephemeral    No storage cost; ephemeral models are compiled into downstream models
Marts          Table                Fast query performance, worth the storage
Large Marts    Incremental          Only process new data, not full refresh
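For the incremental row in the table above, a minimal sketch looks like the following. The model stg_app__events and its columns are hypothetical; is_incremental() and {{ this }} are standard dbt constructs.

```sql
-- models/marts/fct_events.sql (hypothetical)
{{
    config(
        materialized='incremental',
        unique_key='event_id'
    )
}}

select
    event_id,
    user_id,
    event_type,
    occurred_at
from {{ ref('stg_app__events') }}

{% if is_incremental() %}
    -- on incremental runs, only pick up rows newer than what is already loaded
    where occurred_at > (select max(occurred_at) from {{ this }})
{% endif %}
```

On the first run, or with dbt run --full-refresh, the filter is skipped and the whole table is rebuilt. If events can arrive late, a strict greater-than filter may miss them, so a lookback window is a common variation.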

Incremental Models

Partitioning (BigQuery)
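On BigQuery, partitioning and clustering are set in the model config. The sketch below assumes the dbt-bigquery adapter and a hypothetical upstream model int_orders__enriched; the column names are illustrative.

```sql
-- models/marts/fct_orders_partitioned.sql (hypothetical)
{{
    config(
        materialized='table',
        partition_by={
            'field': 'ordered_date',
            'data_type': 'date'
        },
        cluster_by=['customer_id']
    )
}}

select
    order_id,
    customer_id,
    -- the partition column must exist in the final select
    cast(ordered_at as date) as ordered_date,
    order_total
from {{ ref('int_orders__enriched') }}
```

Queries that filter on ordered_date then scan only the matching partitions, which is where most of the cost savings come from.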

Query Optimization Tips

  1. Filter early: Apply WHERE clauses in CTEs, not at the end
  2. Avoid SELECT *: Only select needed columns
  3. Use refs wisely: Don't ref() the same large model multiple times
  4. Leverage database features: Partitions, clusters, indexes
  5. Monitor costs: Use dbt artifacts to track run times and costs
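The first two tips can be sketched together. The models fct_orders and dim_customers, their columns, and the variable reporting_start_date are all assumptions for the example.

```sql
with recent_orders as (

    -- filter early, and keep only the columns used downstream
    select
        order_id,
        customer_id,
        order_total
    from {{ ref('fct_orders') }}
    where ordered_date >= cast('{{ var("reporting_start_date", "2024-01-01") }}' as date)

),

customers as (

    select
        customer_id,
        customer_segment
    from {{ ref('dim_customers') }}

)

select
    customers.customer_segment,
    count(*) as order_count,
    sum(recent_orders.order_total) as revenue
from recent_orders
inner join customers
    on recent_orders.customer_id = customers.customer_id
group by customers.customer_segment
```

Filtering inside the first CTE means the join and aggregation only ever see the reduced row set.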

Documentation Best Practices

YAML Documentation

Inline SQL Comments


Macros & DRY Principles

Reusable Logic with Macros
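A small macro is often enough to remove a repeated expression. The sketch below defines a hypothetical cents_to_dollars macro; the name, parameters, and rounding behavior are illustrative choices.

```sql
-- macros/cents_to_dollars.sql (hypothetical macro)
{% macro cents_to_dollars(column_name, scale=2) %}
    round({{ column_name }} / 100.0, {{ scale }})
{% endmacro %}
```

In a model it would be called as {{ cents_to_dollars('amount_cents') }} as amount_usd, and a change to the conversion rule then happens in one file instead of in every model that converts currency.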

Cross-Database Compatibility
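dbt ships cross-database macros in the dbt namespace that compile to each adapter's own syntax. The sketch below uses dbt.type_string() and dbt.datediff(), and assumes a recent dbt version (1.2 or later) and a hypothetical model int_customers__order_history.

```sql
select
    customer_id,
    -- compiles to the adapter's string type (for example varchar or string)
    cast(customer_id as {{ dbt.type_string() }}) as customer_key,
    -- compiles to the adapter's own date-difference syntax
    {{ dbt.datediff('first_order_at', 'last_order_at', 'day') }} as customer_lifetime_days
from {{ ref('int_customers__order_history') }}
```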


Version Control & CI/CD

Git Workflow

Pre-commit Checks

Slim CI (Only Test Changed Models)


Environment Management

profiles.yml Structure

Environment-Specific Config
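One common form of environment-specific behavior is limiting data volume in development. The sketch below assumes a target named dev in profiles.yml and a hypothetical stg_app__events model; the three-day window is arbitrary, and the exact type returned by dbt.dateadd() varies by adapter.

```sql
select
    event_id,
    user_id,
    event_type,
    occurred_at
from {{ ref('stg_app__events') }}

{% if target.name == 'dev' %}
    -- in development, build against the last three days only to keep runs fast
    where occurred_at >= {{ dbt.dateadd('day', -3, 'current_timestamp') }}
{% endif %}
```

Production runs use a different target name, so the filter is simply not compiled into the query there.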


Monitoring & Observability

Key Metrics to Track

  1. Run duration: Are models getting slower?
  2. Test failures: Data quality degrading?
  3. Row counts: Unexpected spikes or drops?
  4. Freshness: Is source data stale?

Source Freshness

Exposure Tracking


Common Anti-Patterns to Avoid

❌ Anti-Pattern 1: SELECT * in Production

❌ Anti-Pattern 2: Hard-Coded Values
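A sketch of the usual fix: move the literal into a project variable. The variable name high_value_order_threshold, its default of 500, and the fct_orders model are hypothetical.

```sql
-- instead of: where order_total >= 500
select
    order_id,
    customer_id,
    order_total
from {{ ref('fct_orders') }}
where order_total >= {{ var('high_value_order_threshold', 500) }}
```

The value can then be set once under vars in dbt_project.yml, or overridden for a single run with the --vars flag.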

❌ Anti-Pattern 3: Circular Dependencies

❌ Anti-Pattern 4: Unstable Sort Orders
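Sort orders become unstable when the ordering columns contain ties, so the same query can pick a different row on each run. A minimal sketch of the fix, assuming a hypothetical fct_orders model where order_id is unique:

```sql
with ranked as (

    select
        customer_id,
        order_id,
        ordered_at,
        row_number() over (
            partition by customer_id
            -- order_id breaks ties when two orders share the same timestamp
            order by ordered_at desc, order_id desc
        ) as order_rank
    from {{ ref('fct_orders') }}

)

select
    customer_id,
    order_id,
    ordered_at
from ranked
where order_rank = 1
```

Adding a unique column as the final sort key makes the ranking deterministic across runs.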

❌ Anti-Pattern 5: Testing in Production


Quick Wins Checklist

  • Adopt standard folder structure (staging/intermediate/marts)
  • Use consistent naming conventions (stg_, int_, fct_, dim_)
  • Add unique/not_null tests to all primary keys
  • Document business-critical models
  • Set up CI/CD to run dbt on pull requests
  • Configure source freshness checks
  • Use incremental models for large event tables
  • Partition large tables by date
  • Version control everything in Git
  • Monitor dbt run performance over time
