Getting Started with dbt

This guide will walk you through setting up dbt and running your first transformation.

Prerequisites

Before you start, make sure you have:

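  • Python 3.8 or later installed (recent dbt Core releases no longer support 3.7; check the requirements for the version you install)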
  • Access to a data warehouse (Snowflake, BigQuery, Redshift, Databricks, Postgres, etc.)
  • Database credentials with permissions to create schemas and tables
  • Git installed (recommended for version control)
  • Basic SQL knowledge

Installation

Option 1: Install dbt Core (Command Line)

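Install dbt Core together with the adapter package for your warehouse. Adapters are published as separate packages, such as dbt-postgres, dbt-snowflake, dbt-bigquery, dbt-redshift, and dbt-databricks. For Postgres, for example, run: pip install dbt-core dbt-postgres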

Option 2: Use dbt Cloud (Web-based IDE)

dbt Cloud provides a hosted environment with a web IDE, scheduler, and more:

  • Sign up at getdbt.com/signup
  • A free tier is available for individual developers
  • No local installation is required

Initialize Your First Project

1. Create a New dbt Project

Run dbt init my_dbt_project, replacing my_dbt_project with your own project name. You'll be prompted to:

  • Choose your database adapter
  • Enter connection details

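This creates a project directory containing dbt_project.yml and folders such as models/ (with the example/ models), seeds/, snapshots/, macros/, tests/, and analyses/.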

2. Configure Your Database Connection

Edit ~/.dbt/profiles.yml (created during init):
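The exact keys in a profile depend on your adapter. As a rough sketch, assuming the dbt-postgres adapter and a project named my_dbt_project, the profile might look like the one below. Every value (profile name, host, user, password, database, schema) is a hypothetical placeholder. The snippet writes to profiles.example.yml in the current directory so that an existing ~/.dbt/profiles.yml is never overwritten; copy the entries across once they match your setup.

```shell
# Hypothetical values throughout: replace the profile name, host, user,
# password, database, and schema with your own.
cat > profiles.example.yml <<'EOF'
my_dbt_project:          # must match the `profile:` key in dbt_project.yml
  target: dev            # output used when no --target flag is passed
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: dbt_user
      password: change_me
      dbname: analytics
      schema: dbt_dev    # schema where dbt builds your models
      threads: 4
EOF
```

Other adapters use different connection keys, so treat this as a shape to adapt rather than a template to copy verbatim.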

Security Tip: Use environment variables for sensitive credentials:
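One way to do this is with dbt's env_var() Jinja function, which reads a value from the environment when the profile is parsed. The sketch below assumes a Postgres profile and two hypothetical variable names, DBT_USER and DBT_PASSWORD; all connection values are placeholders. It writes to an example file rather than touching your real profiles.yml.

```shell
# Hypothetical variable names; in practice, set these in your shell
# profile or a secrets manager rather than typing them inline.
export DBT_USER="dbt_user"
export DBT_PASSWORD="change_me"

# The quoted 'EOF' stops the shell from expanding anything in the YAML,
# so the Jinja expressions reach dbt unchanged.
cat > profiles.env-example.yml <<'EOF'
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: analytics
      schema: dbt_dev
      threads: 4
EOF
```

With this layout the credentials never appear in the file itself, so the profile can be shared or reviewed without exposing them.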

3. Test Your Connection

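Run dbt debug from the project directory. If the profile and connection are valid, the output ends with "All checks passed!".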


Run Your First Transformation

1. Examine the Example Models

dbt creates example models in models/example/:

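  • models/example/my_first_dbt_model.sql selects a few hard-coded rows and is configured to materialize as a table.
  • models/example/my_second_dbt_model.sql selects from the first model through {{ ref('my_first_dbt_model') }}, which is how dbt infers the dependency between the two.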

2. Run the Models

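Run dbt run from the project directory. The output lists each model as it is built, followed by a completion summary.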

dbt just created two tables/views in your warehouse!

3. Test Your Data

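Run dbt test. This runs the tests defined in models/example/schema.yml, which check that the id column of each example model is unique and not null. The starter's first model deliberately includes a null id, so expect a not_null failure until you uncomment the filter at the bottom of my_first_dbt_model.sql and rerun.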

4. Generate Documentation

Run dbt docs generate, then dbt docs serve. The second command serves a searchable website with your project documentation at http://localhost:8080.


Create Your First Real Model

Let's create a simple customer aggregation model.

1. Define a Source

Create models/sources.yml:
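A source declaration tells dbt which existing warehouse tables your models read from. As a minimal sketch, assuming your raw data lives in a schema called raw with customers and orders tables (all three names are hypothetical), the file could be created like this from the project directory:

```shell
mkdir -p models

# Hypothetical schema and table names: point these at your own raw data.
cat > models/sources.yml <<'EOF'
version: 2

sources:
  - name: raw            # name used in {{ source('raw', ...) }}
    schema: raw          # warehouse schema that holds the tables
    tables:
      - name: customers
      - name: orders
EOF
```

Models can then refer to these tables as {{ source('raw', 'customers') }} and {{ source('raw', 'orders') }} instead of hard-coding schema names.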

2. Create a Transformation Model

Create models/customer_orders.sql:
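The aggregation could be sketched as follows. This assumes a source named raw with customers and orders tables has been declared, and that those tables have customer_id, order_id, and order_date columns; all of these names are hypothetical and should be changed to match your data.

```shell
mkdir -p models

# Hypothetical source and column names; the quoted 'EOF' keeps the
# Jinja expressions intact for dbt to render.
cat > models/customer_orders.sql <<'EOF'
with customers as (
    select * from {{ source('raw', 'customers') }}
),

orders as (
    select * from {{ source('raw', 'orders') }}
),

customer_orders as (
    select
        customers.customer_id,
        count(orders.order_id) as order_count,
        min(orders.order_date) as first_order_date,
        max(orders.order_date) as most_recent_order_date
    from customers
    -- left join keeps customers who have not placed any orders
    left join orders
        on orders.customer_id = customers.customer_id
    group by customers.customer_id
)

select * from customer_orders
EOF
```

Because count(orders.order_id) ignores nulls, customers without orders get an order_count of 0 rather than being dropped.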

3. Add Tests

Create models/customer_orders.yml:
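Tests are declared per column in a YAML file next to the model. The sketch below assumes the customer_orders model exposes customer_id and order_count columns (hypothetical names) and applies dbt's built-in unique and not_null tests. Newer dbt versions also accept data_tests: in place of tests:.

```shell
mkdir -p models

# Hypothetical column names: adjust to the columns your model returns.
cat > models/customer_orders.yml <<'EOF'
version: 2

models:
  - name: customer_orders
    description: One row per customer with order counts and dates.
    columns:
      - name: customer_id
        description: Primary key of the model.
        tests:
          - unique
          - not_null
      - name: order_count
        description: Number of orders placed by the customer.
        tests:
          - not_null
EOF
```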

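4. Run and Test

To build and test only the new model, run dbt run --select customer_orders, then dbt test --select customer_orders.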


Essential dbt Commands

Command                         Description
dbt run                         Execute all models (create tables/views)
dbt run --select model_name     Run a specific model
dbt run --select model_name+    Run a model and all downstream models
dbt run --select +model_name    Run a model and all upstream models
dbt test                        Run all tests
dbt build                       Run models and tests together
dbt docs generate               Generate documentation
dbt docs serve                  Serve documentation site
dbt debug                       Test database connection
dbt compile                     Compile Jinja/macros without running
dbt seed                        Load CSV files from seeds/
dbt snapshot                    Run snapshot models
dbt clean                       Delete compiled files

Project Configuration (dbt_project.yml)

Key configurations in dbt_project.yml:
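For orientation, a pared-down project file might look like the sketch below. The project and profile name my_dbt_project is hypothetical, and the snippet writes to dbt_project.example.yml so the file generated by dbt init is left untouched.

```shell
# Hypothetical project name; `profile` must match an entry in profiles.yml.
cat > dbt_project.example.yml <<'EOF'
name: my_dbt_project
version: "1.0.0"
config-version: 2

profile: my_dbt_project

# Where dbt looks for each kind of resource
model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
analysis-paths: ["analyses"]

# Directories removed by `dbt clean`
clean-targets:
  - "target"
  - "dbt_packages"

# Defaults applied to every model under models/example/
models:
  my_dbt_project:
    example:
      +materialized: view
EOF
```

Settings under models: cascade by folder, so a config() block inside an individual model overrides the folder-level default.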


Common Materialization Types

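Configure how dbt builds each model by setting materialized, either in a config() block at the top of the model file or under models: in dbt_project.yml. The main types are: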

View (Default)

  • Fastest to build
  • Always up-to-date
  • Slower to query
  • No storage cost

Table

  • Slower to build
  • Fast to query
  • Uses storage
  • Stale until next run

Incremental

  • Only processes new/changed records
  • Fast builds for large datasets
  • Most complex to implement
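Since incremental models take the most care, here is a minimal sketch of one. It assumes a source named raw with an orders table that has order_id, customer_id, and order_date columns (all hypothetical), and it treats order_date as the marker for new rows, which only holds if existing rows are never updated after loading.

```shell
mkdir -p models

# Hypothetical source and column names; adapt the filter to however
# your data signals new or changed rows.
cat > models/orders_incremental.sql <<'EOF'
{{ config(materialized='incremental', unique_key='order_id') }}

select
    order_id,
    customer_id,
    order_date
from {{ source('raw', 'orders') }}

{% if is_incremental() %}
  -- On incremental runs, only pick up rows newer than what is already loaded.
  where order_date > (select max(order_date) from {{ this }})
{% endif %}
EOF
```

On the first run the table does not exist yet, so is_incremental() is false and dbt builds the full table. Later runs apply the filter, and passing --full-refresh to dbt run rebuilds the table from scratch.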

Ephemeral

  • Not materialized in warehouse
  • Compiled as CTEs in downstream models
  • Saves storage but increases query complexity

Next Steps

Now that you have dbt running:

  1. Organize Your Models - Learn folder structures in Best Practices
  2. Explore Real Use Cases - See practical examples in Use Cases
  3. Build Something Real - Follow Step-by-Step Tutorials
  4. Join the Community - Ask questions in dbt Slack
