Getting Started with dbt
This guide walks you through installing dbt, connecting it to your warehouse, and running your first transformation.
Prerequisites
Before you start, make sure you have:
- Python 3.9+ installed (older dbt Core releases accepted 3.7 and 3.8, but recent ones do not)
- Access to a data warehouse (Snowflake, BigQuery, Redshift, Databricks, Postgres, etc.)
- Database credentials with permissions to create schemas and tables
- Git installed (recommended for version control)
- Basic SQL knowledge
Installation
Option 1: Install dbt Core (Command Line)
Install dbt for your specific warehouse adapter:
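As a minimal sketch, assuming a Postgres warehouse: adapter plugins are published on PyPI under names of the form dbt-<adapter>, so the install comes down to one pip command. DBT_ADAPTER is only an illustrative shell variable here, not something dbt itself reads.

```shell
# Illustrative variable (not read by dbt): the adapter matching your warehouse,
# e.g. postgres, snowflake, bigquery, redshift, or databricks.
DBT_ADAPTER=postgres

# Install dbt Core together with the adapter plugin.
python3 -m pip install dbt-core "dbt-${DBT_ADAPTER}"

# Confirm the install by printing the core and plugin versions.
dbt --version
```

Installing into a virtual environment keeps dbt's dependencies separate from other Python projects.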
Option 2: Use dbt Cloud (Web-based IDE)
dbt Cloud provides a hosted environment with a web IDE, scheduler, and more:
- Sign up at getdbt.com/signup
- Free tier available for individual developers
- No local installation required
Initialize Your First Project
1. Create a New dbt Project
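A sketch of the scaffolding step, assuming a project called my_dbt_project (the name and the PROJECT_NAME variable are placeholders):

```shell
# Hypothetical project name; dbt init creates a folder with this name.
PROJECT_NAME=my_dbt_project

# Scaffold the project. This is interactive: dbt asks for the adapter
# and the connection details.
dbt init "$PROJECT_NAME"

# The remaining commands in this guide are run from inside the project folder.
cd "$PROJECT_NAME"
```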
You'll be prompted to:
- Choose your database adapter
- Enter connection details
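This creates a project directory containing dbt_project.yml plus folders such as models/ (with the example/ models), seeds/, snapshots/, macros/, analyses/, and tests/.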
2. Configure Your Database Connection
Edit ~/.dbt/profiles.yml (created during init):
Security Tip: Use environment variables for sensitive credentials:
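The following is one possible profile, assuming a local Postgres database. The host, database, schema, user, and the DBT_USER / DBT_PASSWORD variable names are all placeholders. It writes to a scratch file so an existing ~/.dbt/profiles.yml is not overwritten; copy the relevant parts across once the values are right.

```shell
# Placeholder credentials. In practice, export these from a secrets manager
# or your shell profile instead of hard-coding them in a script.
export DBT_USER="dbt_user"
export DBT_PASSWORD="change-me"

# The quoted 'EOF' stops the shell from expanding anything inside the file,
# so the Jinja env_var() calls reach dbt unchanged.
cat > profiles.example.yml <<'EOF'
my_dbt_project:          # must match the "profile" value in dbt_project.yml
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: analytics
      schema: dbt_dev
      threads: 4
EOF
```

Because dbt resolves env_var() when it loads the profile, the password never has to be stored in the file itself.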
3. Test Your Connection
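Execute dbt debug from the project directory. If the profile and connection are valid, the output ends with "All checks passed!".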
Run Your First Transformation
1. Examine the Example Models
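dbt creates two example models in models/example/:
- my_first_dbt_model.sql builds a small table from hard-coded rows.
- my_second_dbt_model.sql selects from the first model using ref(), which is how dbt learns the dependency between them.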
2. Run the Models
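Execute dbt run from the project directory. The output lists each model with its status, followed by a summary line.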
dbt just created two tables/views in your warehouse!
3. Test Your Data
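Execute dbt test. This runs the tests defined in models/example/schema.yml, which check that the id column of each example model is unique and not null. In the standard starter project, the not_null test on my_first_dbt_model fails on purpose until you uncomment the "where id is not null" line in that model.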
4. Generate Documentation
Execute dbt docs generate, then dbt docs serve. This opens a searchable website with your project documentation at http://localhost:8080.
Create Your First Real Model
Let's create a simple customer aggregation model.
1. Define a Source
Create models/sources.yml:
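A minimal sketch of such a file, assuming the raw data lives in a schema called raw_data with customers and orders tables. The source name, schema, and table names are hypothetical; replace them with whatever exists in your warehouse.

```shell
mkdir -p models

# Declare the raw tables so models can reference them with source().
cat > models/sources.yml <<'EOF'
version: 2

sources:
  - name: raw              # hypothetical source name, referenced as source('raw', ...)
    schema: raw_data       # hypothetical schema that holds the raw tables
    tables:
      - name: customers
      - name: orders
EOF
```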
2. Create a Transformation Model
Create models/customer_orders.sql:
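One way the aggregation could look, assuming a source named raw is declared in models/sources.yml with customers and orders tables, and that those tables have customer_id, order_id, and order_date columns. All of these names are illustrative.

```shell
mkdir -p models

# Quoted 'EOF' keeps the shell from touching the Jinja expressions.
cat > models/customer_orders.sql <<'EOF'
with customers as (
    select * from {{ source('raw', 'customers') }}
),

orders as (
    select * from {{ source('raw', 'orders') }}
)

-- One row per customer; customers without orders get order_count = 0.
select
    customers.customer_id,
    count(orders.order_id) as order_count,
    min(orders.order_date) as first_order_date,
    max(orders.order_date) as most_recent_order_date
from customers
left join orders
    on orders.customer_id = customers.customer_id
group by customers.customer_id
EOF
```

The left join keeps customers who have never ordered, which is why order_count can be tested as not null later on.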
3. Add Tests
Create models/customer_orders.yml:
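A sketch of the test file, assuming the model exposes customer_id and order_count columns (illustrative names). It uses the built-in unique and not_null tests.

```shell
mkdir -p models

cat > models/customer_orders.yml <<'EOF'
version: 2

models:
  - name: customer_orders
    description: "One row per customer with order counts and dates"
    columns:
      - name: customer_id
        description: "Primary key of the model"
        tests:             # dbt 1.8+ also accepts data_tests: for this key
          - unique
          - not_null
      - name: order_count
        tests:
          - not_null
EOF
```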
4. Run and Test
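Assuming the model is named customer_orders as in the steps above, building and testing only that model might look like this (MODEL is just a shell variable for readability):

```shell
# Name of the model to build; matches models/customer_orders.sql.
MODEL=customer_orders

# Build only this model, then run only the tests attached to it.
dbt run --select "$MODEL"
dbt test --select "$MODEL"
```

Selecting a single model keeps the feedback loop short while you iterate; dbt build --select does both steps in one command.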
Essential dbt Commands
| Command | Description |
|---|---|
| dbt run | Execute all models (create tables/views) |
| dbt run --select model_name | Run a specific model |
| dbt run --select model_name+ | Run a model and all downstream models |
| dbt run --select +model_name | Run a model and all upstream models |
| dbt test | Run all tests |
| dbt build | Run seeds, snapshots, models, and tests together in dependency order |
| dbt docs generate | Generate documentation |
| dbt docs serve | Serve the documentation site locally |
| dbt debug | Test the database connection and project setup |
| dbt compile | Compile Jinja/macros to SQL without running it |
| dbt seed | Load CSV files from seeds/ |
| dbt snapshot | Run snapshots |
| dbt clean | Delete the folders listed in clean-targets (typically target/ and dbt_packages/) |
Project Configuration (dbt_project.yml)
Key configurations in dbt_project.yml:
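As a rough illustration, a small dbt_project.yml for a project named my_dbt_project could contain the following. The project and profile names are placeholders, and the file is written to a scratch path so the one generated by dbt init is left alone.

```shell
cat > dbt_project.example.yml <<'EOF'
name: my_dbt_project
version: "1.0.0"
config-version: 2

# Which entry in profiles.yml supplies the connection.
profile: my_dbt_project

# Where dbt looks for each kind of resource.
model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

# Folders removed by dbt clean.
clean-targets: ["target", "dbt_packages"]

# Folder-level defaults; a config() block inside a model overrides these.
models:
  my_dbt_project:
    example:
      +materialized: view
EOF
```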
Common Materialization Types
Configure how dbt builds your models:
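A materialization can be set for a whole folder in dbt_project.yml or for a single model with a config() block at the top of its SQL file. The sketch below shows the per-model form; the file name is hypothetical and the query is a stand-in.

```shell
mkdir -p models

cat > models/example_table_model.sql <<'EOF'
-- Overrides any folder-level default for this one model.
{{ config(materialized='table') }}

select 1 as id
EOF
```

Swapping 'table' for 'view' or 'ephemeral' changes how this model is built without touching the query.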
View (Default)
- Fastest to build
- Always up-to-date
- Slower to query
- No storage cost
Table
- Slower to build
- Fast to query
- Uses storage
- Stale until next run
Incremental
- Only processes new/changed records
- Fast builds for large datasets
- Most complex to implement
Ephemeral
- Not materialized in warehouse
- Compiled as CTEs in downstream models
- Saves storage but increases query complexity
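Since incremental models are the hardest of the four to set up, here is a minimal sketch of the usual pattern. It assumes a source named raw with an orders table that has order_id and updated_at columns; these names and the file name are illustrative.

```shell
mkdir -p models

cat > models/orders_incremental.sql <<'EOF'
{{ config(materialized='incremental', unique_key='order_id') }}

select
    order_id,
    customer_id,
    order_date,
    updated_at
from {{ source('raw', 'orders') }}

{% if is_incremental() %}
-- On incremental runs, only pick up rows newer than what is already loaded.
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
EOF
```

On the first run, or with dbt run --full-refresh, is_incremental() is false and the whole table is built; afterwards only rows passing the filter are processed, and unique_key lets dbt update existing rows instead of duplicating them.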
Next Steps
Now that you have dbt running:
- Organize Your Models - Learn folder structures in Best Practices
- Explore Real Use Cases - See practical examples in Use Cases
- Build Something Real - Follow Step-by-Step Tutorials
- Join the Community - Ask questions in dbt Slack
Need Help?
- Stuck on setup? Check the dbt docs
- Want hands-on guidance? Explore my consulting services
- Looking for team training? Let's talk about custom workshops