Getting Started with dbt

This guide walks you through setting up dbt and running your first transformation.

Prerequisites

Before you start, make sure you have:

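A recent Python 3 release installed (current dbt Core versions no longer support Python 3.7; check the supported versions for your dbt release)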
Access to a data warehouse (Snowflake, BigQuery, Redshift, Databricks, Postgres, etc.)
Database credentials with permissions to create schemas and tables
Git installed (recommended for version control)
Basic SQL knowledge

Installation

Option 1: Install dbt Core (Command Line)

Install dbt for your specific warehouse adapter:
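A minimal sketch of a dbt Core install, assuming a Unix-like shell and pip. The `install_dbt` helper and the `dbt-env` directory are illustrative names, not part of dbt. The steps are wrapped in a function so nothing is installed until you call it.

```shell
# Hypothetical helper: installs dbt Core plus one adapter into a fresh virtual environment.
install_dbt() {
  adapter="${1:-postgres}"   # e.g. postgres, snowflake, bigquery, redshift, databricks

  # Keep dbt's dependencies isolated from the system Python.
  python3 -m venv dbt-env || return 1
  . dbt-env/bin/activate

  # Each adapter ships as its own package named dbt-<adapter>.
  python -m pip install dbt-core "dbt-${adapter}" || return 1

  # Confirm the CLI is available and list the installed versions.
  dbt --version
}
```

Calling `install_dbt snowflake` (or another adapter name) runs the install. On Windows the activation script lives under `dbt-env\Scripts` instead.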

Option 2: Use dbt Cloud (Web-based IDE)

dbt Cloud provides a hosted environment with a web IDE, scheduler, and more:

Sign up at getdbt.com/signup
Free tier available for individual developers
No local installation required

Initialize Your First Project

1. Create a New dbt Project
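The project is typically scaffolded with `dbt init`. The sketch below assumes the project name `my_dbt_project` (any name made of letters, digits, and underscores should work) and skips the command if the dbt CLI is not installed.

```shell
PROJECT_NAME="my_dbt_project"   # assumption: pick your own project name

if command -v dbt >/dev/null 2>&1; then
  # Scaffolds a new project directory and asks for connection details.
  dbt init "$PROJECT_NAME"
  cd "$PROJECT_NAME" || echo "project directory was not created" >&2
else
  echo "dbt CLI not found on PATH; install dbt Core first" >&2
fi
```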

You'll be prompted to:

Choose your database adapter
Enter connection details

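This creates a project directory containing dbt_project.yml plus folders such as models/, seeds/, snapshots/, macros/, tests/, and analyses/.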

2. Configure Your Database Connection

Edit ~/.dbt/profiles.yml (created during init):
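As an illustration, a Postgres profile might look like the one below. The host, user, password, database, and schema values are placeholders, and the profile name is assumed to be `my_dbt_project`. The snippet writes to a scratch file so that an existing ~/.dbt/profiles.yml is not overwritten; copy the parts you need into the real file.

```shell
# Write a sample profile to a scratch file in the current directory.
cat > profiles.example.yml <<'YAML'
my_dbt_project:          # must match the `profile:` value in dbt_project.yml
  target: dev            # output used when no --target flag is given
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: dbt_user
      password: change-me
      dbname: analytics
      schema: dbt_dev    # schema where dbt builds your models
      threads: 4         # number of models dbt builds in parallel
YAML
```

Other adapters use different connection keys, so check the adapter's setup page before adapting this.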

Security Tip: Use environment variables for sensitive credentials:
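One way to do this is dbt's `env_var` Jinja function, which reads a value from the environment when the profile is parsed. The variable names `DBT_USER` and `DBT_PASSWORD` and all connection values below are assumptions for the sketch, which again writes to a scratch file rather than the real profile.

```shell
# Placeholder values; in practice load these from a secrets manager or your shell profile.
export DBT_USER="dbt_user"
export DBT_PASSWORD="change-me"

# The quoted heredoc keeps the Jinja expressions intact for dbt to resolve later.
cat > profiles.env-example.yml <<'YAML'
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: analytics
      schema: dbt_dev
      threads: 4
YAML
```

With this layout the credentials never appear in the file, so the profile can be shared or committed without exposing them.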

3. Test Your Connection
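The connection check is `dbt debug`, run from the project root. A small sketch that records the result in a shell variable (`connection_ok` is an illustrative name):

```shell
if command -v dbt >/dev/null 2>&1; then
  # Validates profiles.yml, dbt_project.yml, and the warehouse connection.
  if dbt debug; then
    connection_ok=yes
  else
    connection_ok=no
  fi
else
  echo "dbt CLI not found on PATH" >&2
  connection_ok=unknown
fi
echo "connection check: $connection_ok"
```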

If the connection works, the output ends with `All checks passed!`.

Run Your First Transformation

1. Examine the Example Models

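dbt creates two example models in models/example/:

models/example/my_first_dbt_model.sql builds a small table from hard-coded rows.

models/example/my_second_dbt_model.sql selects from the first model through ref(), which is how dbt learns the dependency between them.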
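To read the generated files without opening an editor, a small helper like the following prints every model in a folder. `show_models` is a hypothetical function written for this guide, not a dbt command.

```shell
# Hypothetical helper: print each .sql model in a directory with a header line.
show_models() {
  dir="${1:-models/example}"
  for file in "$dir"/*.sql; do
    # An unmatched glob stays literal, so check that the file really exists.
    [ -f "$file" ] || { echo "no .sql models found in $dir"; return 0; }
    echo "== $file"
    cat "$file"
    echo
  done
}

show_models models/example
```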

2. Run the Models
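Models are built with `dbt run` from the project root. The sketch assumes a target named `dev` in your profile; drop the `--target` flag to use the profile's default.

```shell
DBT_TARGET="dev"   # assumption: matches an output name in profiles.yml

if command -v dbt >/dev/null 2>&1; then
  # Builds every model in the project in dependency order.
  dbt run --target "$DBT_TARGET"
else
  echo "dbt CLI not found on PATH" >&2
fi
```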

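The output lists each model as it is built, followed by a summary of how many succeeded, errored, or were skipped.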

dbt just created two tables/views in your warehouse!

3. Test Your Data
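Tests are executed with `dbt test`, which exits with a non-zero status when any test fails. A sketch that captures the outcome (`tests_passed` is an illustrative variable name):

```shell
if command -v dbt >/dev/null 2>&1; then
  # Runs every test in the project; add --select <model_name> to limit the scope.
  if dbt test; then
    tests_passed=yes
  else
    tests_passed=no
  fi
else
  echo "dbt CLI not found on PATH" >&2
  tests_passed=unknown
fi
echo "tests passed: $tests_passed"
```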

This runs the tests defined in models/example/schema.yml, such as unique and not_null checks on the id column.

4. Generate Documentation
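Documentation takes two commands: one to build the site and one to serve it. The sketch below passes the port explicitly; 8080 is dbt's default, and `DOCS_PORT` is just an illustrative variable.

```shell
DOCS_PORT=8080   # change this if the port is already in use

if command -v dbt >/dev/null 2>&1; then
  # Writes the manifest, catalog, and site files into target/.
  dbt docs generate
  # Starts a local web server; it keeps running until you press Ctrl+C.
  dbt docs serve --port "$DOCS_PORT"
else
  echo "dbt CLI not found on PATH" >&2
fi
```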

`dbt docs generate` builds the documentation, and `dbt docs serve` opens a searchable website for your project at http://localhost:8080.

Create Your First Real Model

Let's create a simple customer aggregation model.

1. Define a Source

Create models/sources.yml:
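A possible sources file, assuming the raw data lives in a schema called `raw_data` with `customers` and `orders` tables. The source name `raw`, the schema, and the table names are all placeholders for your own warehouse objects.

```shell
mkdir -p models

# Declare the raw tables so models can reference them with source().
cat > models/sources.yml <<'YAML'
version: 2

sources:
  - name: raw              # name used in source('raw', ...)
    schema: raw_data       # warehouse schema that holds the raw tables
    tables:
      - name: customers
      - name: orders
YAML
```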

2. Create a Transformation Model

Create models/customer_orders.sql:
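The model could look like the sketch below: one row per customer with order counts and totals. It assumes a source named `raw` with `customers` and `orders` tables, and the column names (`customer_id`, `first_name`, `last_name`, `order_id`, `order_date`, `amount`) are hypothetical.

```shell
mkdir -p models

cat > models/customer_orders.sql <<'SQL'
with customers as (
    select customer_id, first_name, last_name
    from {{ source('raw', 'customers') }}
),

orders as (
    select order_id, customer_id, order_date, amount
    from {{ source('raw', 'orders') }}
)

-- Left join keeps customers who have not placed an order yet.
select
    customers.customer_id,
    customers.first_name,
    customers.last_name,
    count(orders.order_id) as order_count,
    min(orders.order_date) as first_order_date,
    max(orders.order_date) as most_recent_order_date,
    coalesce(sum(orders.amount), 0) as total_amount
from customers
left join orders
    on orders.customer_id = customers.customer_id
group by
    customers.customer_id,
    customers.first_name,
    customers.last_name
SQL
```

Referencing tables through `source()` rather than hard-coded names lets dbt track lineage back to the raw data.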

3. Add Tests

Create models/customer_orders.yml:
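A sketch of the properties file, assuming the model exposes `customer_id` and `order_count` columns. It uses the built-in `unique` and `not_null` tests; dbt 1.8 and later also accept `data_tests:` in place of `tests:`.

```shell
mkdir -p models

cat > models/customer_orders.yml <<'YAML'
version: 2

models:
  - name: customer_orders
    description: One row per customer with order counts and totals.
    columns:
      - name: customer_id
        description: Primary key of the model.
        tests:
          - unique
          - not_null
      - name: order_count
        description: Number of orders placed by the customer.
        tests:
          - not_null
YAML
```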

4. Run and Test
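To build and test only the new model, the `--select` flag narrows both commands. The sketch assumes the model is named `customer_orders`.

```shell
MODEL="customer_orders"

if command -v dbt >/dev/null 2>&1; then
  # Build just this model, then run only the tests attached to it.
  dbt run --select "$MODEL"
  dbt test --select "$MODEL"
else
  echo "dbt CLI not found on PATH" >&2
fi
```

`dbt build --select customer_orders` is an alternative that runs the model and its tests in a single step.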

Essential dbt Commands

Command	Description
`dbt run`	Execute all models (create tables/views)
`dbt run --select model_name`	Run a specific model
`dbt run --select model_name+`	Run a model and all downstream models
`dbt run --select +model_name`	Run a model and all upstream models
`dbt test`	Run all tests
`dbt build`	Run models, tests, seeds, and snapshots together in dependency order
`dbt docs generate`	Generate documentation
`dbt docs serve`	Serve documentation site
`dbt debug`	Test database connection
`dbt compile`	Compile Jinja/macros without running
`dbt seed`	Load CSV files from seeds/
`dbt snapshot`	Run snapshots (record how source rows change over time)
`dbt clean`	Delete the directories listed in clean-targets (typically target/ and dbt_packages/)

Project Configuration (dbt_project.yml)

Key configurations in dbt_project.yml:
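For reference, a trimmed project file might look like this. The project and profile name `my_dbt_project` is an assumption, and the snippet writes to a scratch file so the real dbt_project.yml is left untouched.

```shell
cat > dbt_project.example.yml <<'YAML'
name: 'my_dbt_project'
version: '1.0.0'
profile: 'my_dbt_project'      # must match a profile name in profiles.yml

# Where dbt looks for each type of resource.
model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

# Directories removed by `dbt clean`.
clean-targets:
  - "target"
  - "dbt_packages"

# Folder-level defaults; a config() block inside a model overrides these.
models:
  my_dbt_project:
    example:
      +materialized: view
YAML
```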

Common Materialization Types

Configure how dbt builds your models:
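The materialization can be set per model with a `config()` block at the top of the SQL file. A minimal sketch, using a hypothetical model named `materialization_demo`:

```shell
mkdir -p models

cat > models/materialization_demo.sql <<'SQL'
-- Hypothetical model: change 'table' to 'view' or 'ephemeral' to compare behavior.
{{ config(materialized='table') }}

select 1 as id
SQL
```

A setting in the model file takes precedence over a folder-level default in dbt_project.yml.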

View (Default)

Fastest to build
Always up-to-date
Slower to query
No storage cost

Table

Slower to build
Fast to query
Uses storage
Stale until next run

Incremental

Only processes new/changed records
Fast builds for large datasets
Most complex to implement
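An incremental model usually pairs a `config()` block with an `is_incremental()` filter. The sketch below assumes a source `raw.orders` with an `order_date` column and a unique `order_id`; the model name `orders_incremental` and all column names are placeholders.

```shell
mkdir -p models

cat > models/orders_incremental.sql <<'SQL'
{{ config(materialized='incremental', unique_key='order_id') }}

select order_id, customer_id, order_date, amount
from {{ source('raw', 'orders') }}

{% if is_incremental() %}
-- On incremental runs, only pick up rows newer than what is already loaded.
where order_date > (select max(order_date) from {{ this }})
{% endif %}
SQL
```

On the first run the filter is skipped and the full table is built. Later runs process only newer rows, and `dbt run --full-refresh` rebuilds the table from scratch.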

Ephemeral

Not materialized in warehouse
Compiled as CTEs in downstream models
Saves storage but increases query complexity

Next Steps

Now that you have dbt running:

Organize Your Models - Learn folder structures in Best Practices
Explore Real Use Cases - See practical examples in Use Cases
Build Something Real - Follow Step-by-Step Tutorials
Join the Community - Ask questions in dbt Slack

Need Help?

Stuck on setup? Check the dbt docs
Want hands-on guidance? Explore my consulting services
Looking for team training? Let's talk about custom workshops
