
Getting Started with dlt

This guide will help you install dlt, run your first pipeline, and understand the core workflows.


Installation

Prerequisites

  • Python 3.8 or higher
  • pip (Python package installer)

Install dlt

Basic Installation:

With Specific Destination:

Verify Installation:

Your First Pipeline

Simple Example: Load Data to DuckDB

Step 1: Create a Python file (my_first_pipeline.py):

import dlt

# Sample data to load (any list of dicts works)
data = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"},
]

# Create pipeline
pipeline = dlt.pipeline(
    pipeline_name="my_first_pipeline",
    destination="duckdb",
    dataset_name="test_data",
)

# Run pipeline
load_info = pipeline.run(data, table_name="users")

# Print results
print(load_info)

Step 2: Run the pipeline:

python my_first_pipeline.py

Output: a load summary is printed, including the pipeline name, the destination, and the status of each load package.

Step 3: Query the data:

Or use DuckDB directly:

Loading from an API

Example: GitHub API

Using Verified Sources

dlt provides 50+ pre-built sources you can use immediately.

Initialize a Verified Source

This creates:

  • a folder with the source code (e.g. github/)
  • a sample pipeline script (e.g. github_pipeline.py)
  • a .dlt/ folder with config.toml and secrets.toml
  • a requirements.txt listing the source's dependencies

Configure Credentials

Edit .dlt/secrets.toml:
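The exact section and key names depend on the source; for the GitHub source it might look like this (the token value is a placeholder):

```toml
[sources.github]
access_token = "ghp_your_token_here"
```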

Or use environment variables:
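dlt maps environment variables to config keys by upper-casing them and joining sections with double underscores:

```shell
export SOURCES__GITHUB__ACCESS_TOKEN="ghp_your_token_here"
```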

Run the Verified Source

Run:
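Install the source's dependencies, then execute the generated script (file names follow the `<source>_pipeline.py` pattern from the scaffold):

```shell
pip install -r requirements.txt
python github_pipeline.py
```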

Configuration

Project Structure
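A typical dlt project looks roughly like this:

```text
my_project/
├── .dlt/
│   ├── config.toml      # non-sensitive settings
│   └── secrets.toml     # credentials (git-ignored)
├── my_pipeline.py       # pipeline code
└── requirements.txt
```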

Config Files

.dlt/config.toml (Non-sensitive):
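Non-sensitive settings, such as the log level, go here; a small example:

```toml
[runtime]
log_level = "WARNING"
```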

.dlt/secrets.toml (Sensitive - add to .gitignore):
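Credentials live here. For example, destination credentials for a Postgres database (all values are placeholders):

```toml
[destination.postgres.credentials]
database = "analytics"
username = "loader"
password = "change_me"
host = "localhost"
port = 5432
```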

Environment Variables

Alternative to config files:
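Any key from the TOML files can be supplied as an environment variable instead, using upper-case names and double underscores between sections:

```shell
export DESTINATION__POSTGRES__CREDENTIALS__PASSWORD="change_me"
export SOURCES__GITHUB__ACCESS_TOKEN="ghp_your_token_here"
```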

Loading to Different Destinations

DuckDB (Local)

BigQuery

Setup:

.dlt/secrets.toml:
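Service-account credentials go under destination.bigquery.credentials; the values below are placeholders taken from a service-account JSON file:

```toml
[destination.bigquery.credentials]
project_id = "my-gcp-project"
private_key = "-----BEGIN PRIVATE KEY-----\nabc...\n-----END PRIVATE KEY-----\n"
client_email = "loader@my-gcp-project.iam.gserviceaccount.com"
```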

Pipeline:

Snowflake

Setup:

.dlt/secrets.toml:
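Snowflake credentials use the account identifier as the host; all values are placeholders:

```toml
[destination.snowflake.credentials]
database = "ANALYTICS"
username = "LOADER"
password = "change_me"
host = "my_org-my_account"
warehouse = "COMPUTE_WH"
role = "LOADER_ROLE"
```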

Pipeline:

PostgreSQL

Setup:

Pipeline:

Incremental Loading

Tracking New Records

Tracking Updated Records

Handling Schema Evolution

Automatic Schema Evolution

Controlling Schema

Error Handling and Monitoring

Check Load Results

Logging
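dlt's log level is ordinary configuration; set it in .dlt/config.toml (or via the RUNTIME__LOG_LEVEL environment variable):

```toml
[runtime]
log_level = "INFO"
```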

Retry Logic

Working with DataFrames

Querying Loaded Data

Using SQL Client

Using Destination-Specific Tools

Common Patterns

Loading Multiple Tables

Pagination

Transforming Data

Troubleshooting

Common Issues

Issue: Module not found

Make sure dlt is installed in the active environment (pip install dlt) and install the matching extra for your destination (e.g. pip install "dlt[duckdb]").

Issue: Credentials not found

Check that .dlt/secrets.toml exists in the directory you run the pipeline from, or that the equivalent environment variables are set; section and key names must match the source or destination.

Issue: Schema evolution errors

Inspect the incoming data for changed types; pin column types with column hints or lock the schema with a schema contract.

Issue: Pipeline runs but no data loaded

Confirm the resource actually yields records, and for incremental loads check that the cursor's initial value is not filtering everything out.

Debug Mode
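Two knobs help when a run misbehaves: raise dlt's log level, and inspect the last run's trace with the CLI (using the pipeline name from your script):

```shell
export RUNTIME__LOG_LEVEL="DEBUG"
python my_first_pipeline.py

# inspect what the last run actually did
dlt pipeline my_first_pipeline trace
```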

Next Steps

Now that you've created your first pipelines:

  1. Explore Verified Sources: Try dlt init --list-sources
  2. Add Incremental Loading: Reduce data transfer
  3. Integrate with dbt: Transform loaded data
  4. Set up Orchestration: Use with Airflow/Dagster
  5. Deploy to Production: See Best Practices

Recommended Learning Path

  1. Load data from simple API
  2. Add incremental loading
  3. Load to production warehouse (BigQuery/Snowflake)
  4. Integrate with dbt for transformations
  5. Orchestrate with Airflow
  6. Monitor and optimize

Ready for production? Check out the Best Practices guide.
