
How to Install dbt (data build tool)

Installing data build tool for your specific data warehouse

Photo by Markus Spiske on Unsplash

data build tool (dbt) is one of the most powerful tools in the modern data stack, as it allows teams and organisations to manage and transform data models in a scalable and efficient way. dbt handles the inter-dependencies between data models and provides what you need to run tests over your data and improve the quality of your data assets.

Depending on the data platform you are using, you will also need to install an adapter so that dbt can communicate with that platform. In the next few sections we will demonstrate how to install dbt and the required adapters within a virtual environment, so that you can get started with data build tool.


Creating a virtual environment

First, we need to create a virtual environment, which is an environment isolated from whatever is installed on the host machine:

A virtual environment is created on top of an existing Python installation, known as the virtual environment’s "base" Python, and may optionally be isolated from the packages in the base environment, so only those explicitly installed in the virtual environment are available. – Python Docs

python3 -m venv dbt-venv

And then activate the newly created venv:

source dbt-venv/bin/activate

If everything ran smoothly, you should see a (dbt-venv) prefix at the start of your terminal prompt.
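If the prompt prefix is hidden by your shell theme, a quick check is still possible. The sketch below relies on the VIRTUAL_ENV variable that the activate script exports; the function name venv_status is purely illustrative and not part of Python or dbt.

```shell
# Report whether a Python virtual environment is active in this shell.
# The activate script exports VIRTUAL_ENV; it is unset or empty otherwise.
venv_status() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "active: ${VIRTUAL_ENV}"
  else
    echo "inactive"
  fi
}

venv_status
```

Running deactivate and calling venv_status again should report "inactive".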


Installing dbt-core

dbt offers two ways to interact with the tool and run projects: one in the cloud and one through a command line interface (CLI). In this tutorial, we will demonstrate how to install the packages that let you use dbt from your local machine.

Therefore, the first dependency you need to install is dbt-core. The following command will install the latest version available on PyPI:

pip install dbt-core

If you wish to install a specific version, then you’d have to specify it in the installation command:

pip install dbt-core==1.3.0

Once the installation has completed, you can verify it by running the following command, which prints the dbt version installed on your local machine:

dbt --version
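If the shell reports that the command cannot be found, the virtual environment is often simply not active in the current terminal. A minimal sketch of a guard for that case follows; on_path is a hypothetical helper name, not a dbt or pip feature.

```shell
# Return success if the given executable can be found on PATH.
on_path() {
  command -v "$1" >/dev/null 2>&1
}

if on_path dbt; then
  dbt --version
else
  echo "dbt not found on PATH; is the virtual environment active?" >&2
fi
```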

Installing dbt plugins for your data platform

In order for dbt to run successfully, it needs to establish a connection with the data platform that you (or your teams) use. data build tool can be extended to support a platform through an adapter plugin. You can think of these plugins as Python modules that are used by the dbt-core package we installed in the previous step.

dbt Labs maintains some adapters itself, whilst others were created (and are actively maintained) by the community. You can find the full list of available plugins in the dbt documentation. Below I’ll share installation instructions for some of them:

BigQuery (Google Cloud Platform)

pip install dbt-bigquery

Athena

pip install dbt-athena-adapter

Postgres and AlloyDB

pip install dbt-postgres

Azure Synapse

pip install dbt-synapse

Databricks

pip install dbt-databricks

Redshift

pip install dbt-redshift

Snowflake

pip install dbt-snowflake

Spark

pip install dbt-spark
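If you script your setup, the list above can be captured in a small lookup helper. This is only a sketch: adapter_package is a hypothetical function name, the platform keys are my own shorthand, and the mapping covers just the platforms and package names listed in this article.

```shell
# Map a platform keyword to the adapter package named in the list above.
# Prints the package name, or fails with an error for unknown keywords.
adapter_package() {
  case "$1" in
    bigquery)         echo "dbt-bigquery" ;;
    athena)           echo "dbt-athena-adapter" ;;
    postgres|alloydb) echo "dbt-postgres" ;;
    synapse)          echo "dbt-synapse" ;;
    databricks)       echo "dbt-databricks" ;;
    redshift)         echo "dbt-redshift" ;;
    snowflake)        echo "dbt-snowflake" ;;
    spark)            echo "dbt-spark" ;;
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

adapter_package bigquery   # prints: dbt-bigquery
```

With the virtual environment active, python3 -m pip install dbt-core "$(adapter_package snowflake)" would then install dbt-core together with the Snowflake adapter.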

Next Steps

Now that you have successfully installed dbt-core and the required adapter(s) based on the data platforms that you are using, you are ready to create your first dbt project and the profiles required to interact with the target data platforms. I’ll share some more tutorials on how to do this in the next few days so make sure to subscribe and get notified when these articles are out!


Final Thoughts

If you haven’t already tried data build tool, I’d highly recommend giving it a go. Chances are you will be amazed by how much it helps your team minimise the effort needed to build, manage and maintain data models.

In today’s short tutorial we went through the steps required to set up a dbt installation on your local machine. This guide will help you install the dbt CLI as well as the adapters (based on your preferred data platform) that you need in order to create, manage, run and test data models.




