Getting Started with dlt
This guide will help you install dlt, run your first pipeline, and understand the core workflows.
Installation
Prerequisites
- Python 3.8 or higher
- pip (Python package installer)
Install dlt
Basic Installation:
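The base package installs from PyPI:

```shell
pip install dlt
```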
With Specific Destination:
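Destination extras pull in the matching driver, for example DuckDB:

```shell
pip install "dlt[duckdb]"
```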
Verify Installation:
Your First Pipeline
Simple Example: Load Data to DuckDB
Step 1: Create a Python file (my_first_pipeline.py):
```python
import dlt

# Sample rows to load (any iterable of dicts works)
data = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]

# Create pipeline
pipeline = dlt.pipeline(pipeline_name="my_first_pipeline", destination="duckdb", dataset_name="test_data")

# Run pipeline
load_info = pipeline.run(data, table_name="users")

# Print results
print(load_info)
```
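Step 2: Run the file:

```shell
python my_first_pipeline.py
```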
Output: `print(load_info)` prints a short summary of the load: the pipeline name, the destination, the dataset, and the status of each load package.
Step 3: Query the data:
Or use DuckDB directly:
Loading from an API
Example: GitHub API
Using Verified Sources
dlt provides 50+ pre-built sources you can use immediately.
Initialize a Verified Source
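`dlt init` scaffolds a source and destination pair into the current directory; the `github` source is used here as an example:

```shell
dlt init github duckdb
```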
This creates a folder with the source's code (e.g. github/), a starter pipeline script, a .dlt/ directory containing config.toml and secrets.toml, and a requirements.txt listing the source's dependencies.
Configure Credentials
Edit .dlt/secrets.toml:
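The key names depend on the source you initialized; for the github example above the file might look like:

```toml
[sources.github]
access_token = "ghp_..."
```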
Or use environment variables:
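dlt maps secrets to environment variables by uppercasing the keys and joining sections with double underscores:

```shell
export SOURCES__GITHUB__ACCESS_TOKEN="ghp_..."
```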
Run the Verified Source
Run:
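Install the source's dependencies, then run the scaffolded script (named after the source, e.g. github_pipeline.py here):

```shell
pip install -r requirements.txt
python github_pipeline.py
```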
Configuration
Project Structure
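A project laid out by `dlt init` looks roughly like this (names follow the github example above):

```text
.
├── .dlt/
│   ├── config.toml      # non-sensitive settings
│   └── secrets.toml     # credentials (add to .gitignore)
├── github/              # verified source code
├── github_pipeline.py   # starter pipeline script
└── requirements.txt
```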
Config Files
.dlt/config.toml (Non-sensitive):
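A minimal example; any non-secret setting can live here:

```toml
[runtime]
log_level = "WARNING"
```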
.dlt/secrets.toml (Sensitive - add to .gitignore):
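A sketch with one source credential and one destination credential (section and key names depend on what you use):

```toml
[sources.github]
access_token = "ghp_..."

[destination.postgres.credentials]
database = "analytics"
username = "loader"
password = "..."
host = "localhost"
port = 5432
```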
Environment Variables
Alternative to config files:
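Any config or secret key can be supplied as an environment variable using the `SECTION__SUBSECTION__KEY` convention:

```shell
export DESTINATION__POSTGRES__CREDENTIALS__PASSWORD="..."
export RUNTIME__LOG_LEVEL="INFO"
```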
Loading to Different Destinations
DuckDB (Local)
BigQuery
Setup:
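Install the BigQuery extra:

```shell
pip install "dlt[bigquery]"
```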
.dlt/secrets.toml:
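Service-account credentials go under the destination's credentials section (values shown are placeholders):

```toml
[destination.bigquery]
location = "US"

[destination.bigquery.credentials]
project_id = "my-gcp-project"
private_key = "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
client_email = "loader@my-gcp-project.iam.gserviceaccount.com"
```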
Pipeline:
Snowflake
Setup:
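Install the Snowflake extra:

```shell
pip install "dlt[snowflake]"
```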
.dlt/secrets.toml:
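Snowflake credentials follow the same pattern (values are placeholders; `host` is your account identifier):

```toml
[destination.snowflake.credentials]
database = "ANALYTICS"
username = "LOADER"
password = "..."
host = "xy12345.us-east-1"
warehouse = "COMPUTE_WH"
role = "LOADER_ROLE"
```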
Pipeline:
PostgreSQL
Setup:
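Install the Postgres extra; connection details then go under `[destination.postgres.credentials]` in `.dlt/secrets.toml`:

```shell
pip install "dlt[postgres]"
```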
Pipeline:
Incremental Loading
Tracking New Records
Tracking Updated Records
Handling Schema Evolution
Automatic Schema Evolution
Controlling Schema
Error Handling and Monitoring
Check Load Results
Logging
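Log verbosity is a runtime setting, so it can come from `config.toml` (`[runtime] log_level`) or an environment variable:

```shell
export RUNTIME__LOG_LEVEL="DEBUG"
python my_first_pipeline.py
```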
Retry Logic
Working with DataFrames
Querying Loaded Data
Using SQL Client
Using Destination-Specific Tools
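For DuckDB, the database file can be opened with the duckdb CLI; warehouses like BigQuery or Snowflake have their own consoles:

```shell
duckdb my_first_pipeline.duckdb -c "SELECT * FROM test_data.users;"
```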
Common Patterns
Loading Multiple Tables
Pagination
Transforming Data
Troubleshooting
Common Issues
Issue: Module not found. dlt (or a destination extra) is not installed in the active environment; reinstall with `pip install dlt` (or e.g. `pip install "dlt[duckdb]"`) and confirm the script runs under the same interpreter or virtualenv.
Issue: Credentials not found. Check that `.dlt/secrets.toml` sits next to the script you run (dlt resolves `.dlt/` relative to the working directory), that section and key names match what the source or destination expects, and that any environment variables are uppercase with `__` separators.
Issue: Schema evolution errors. A field changed to an incompatible data type, or a schema contract set to `freeze` is rejecting new columns; inspect the stored schema and either relax the contract, add explicit column hints, or fix the source data.
Issue: Pipeline runs but no data loaded. The resource may have yielded nothing, often because an incremental cursor is already past all available records; print `load_info` and `pipeline.last_trace` to check row counts, and reset the pipeline's state if you need a full reload.
Debug Mode
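The CLI can inspect the last run of a named pipeline, and debug logging can be enabled per invocation:

```shell
# Show the trace of the last run: timings, jobs, failures
dlt pipeline my_first_pipeline trace

# Re-run with full log output
RUNTIME__LOG_LEVEL="DEBUG" python my_first_pipeline.py
```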
Next Steps
Now that you've created your first pipelines:
- Explore Verified Sources: Try `dlt init --list-sources`
- Add Incremental Loading: Reduce data transfer
- Integrate with dbt: Transform loaded data
- Set up Orchestration: Use with Airflow/Dagster
- Deploy to Production: See Best Practices
Recommended Learning Path
- Load data from simple API
- Add incremental loading
- Load to production warehouse (BigQuery/Snowflake)
- Integrate with dbt for transformations
- Orchestrate with Airflow
- Monitor and optimize
Ready for production? Check out:
- Use Cases - Real-world scenarios
- Best Practices - Production patterns
- Tutorials - Hands-on projects