Video: how to set up and run dbt and BigQuery in the Cloud Shell

Get started with BigQuery and dbt, the easy way

Here’s the quickest way to get started with dbt and BigQuery, using only free offerings from Google Cloud.

Felipe Hoffa
5 min read · Jul 28, 2020


Important update: I left Google and joined Snowflake in 2020 — so I’m unable to keep my older posts updated. If you want to try Snowflake, join us — I’m having a lot of fun ❄️.

I’m a big fan of dbt — an open source project that helps me build data pipelines around BigQuery using only SQL.


There’s a lot already written about BigQuery and dbt. For example, there’s this official tutorial on setting up dbt with BigQuery, with far more detail than I go into here (thanks Claire Carroll). The goal of this post is to share some GCP secrets that make the installation as easy as possible.

Step 1: Create a free Google Cloud account

Good news: You don’t need a credit card to have your own Google Cloud account. You’ll get a free terabyte of queries in BigQuery every month, and also a free shell environment you can use through your browser.

While creating your account, you’ll also create your first Google Cloud project. We’ll use its project id later.

Step 2: Welcome to the free Cloud Shell

On console.cloud.google.com, click the Cloud Shell icon at the top, and a shell environment will be ready for you in a minute or so:

Find the Cloud Shell button
A Cloud Shell will open

Step 3: pip3 install dbt

Once in the Cloud Shell, installing dbt is really easy. To avoid problems, skip installing the full dbt package and install just the dbt-bigquery adapter:
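```
pip3 install --user --upgrade dbt-bigquery
```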

Notes:

  • pip3 instead of pip, to make sure we stay in the Python 3 world.
  • --user to avoid installing at the root level.
  • --upgrade just in case an older version of dbt was installed.

Step 3.1: debug

If you get an error like /usr/local/bin/python3.7: bad interpreter: No such file or directory, uninstall dbt and reinstall.
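One way to do that (a sketch, reusing the same flags as the install above):

```
pip3 uninstall -y dbt-bigquery
pip3 install --user --upgrade dbt-bigquery
```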

Step 4: start your first dbt project
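The command looked something like this (my_first_project is just a placeholder name):

```
~/.local/bin/dbt init my_first_project
cd my_first_project
```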

Notes:

  • ~/.local/bin/ is the path to the freshly installed dbt. Consider adding this directory to your PATH environment variable so you don’t have to prepend it every time (one way is shown below).
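For example, in the Cloud Shell’s default bash shell:

```
# make pip's --user install directory part of the PATH
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
```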

Step 4.1: try to run dbt
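From inside the new project directory, a first attempt looks something like this:

```
~/.local/bin/dbt run
```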

That could have worked! But it didn’t: we got an error message like Credentials in profile “default”, target “dev” invalid: Runtime Error instead. That means we still need to configure a way for dbt to connect and authenticate to BigQuery. The good news: this is really easy, since we are already in a Google Cloud environment.

Step 5: configure dbt to connect to BigQuery

Use vim or your favorite shell editor:
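```
# dbt keeps its connection profiles in ~/.dbt/profiles.yml
vim ~/.dbt/profiles.yml
```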

In this file, replace the default configuration (which points to Redshift… why?) and point it at BigQuery instead:
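A sketch of what that profile can look like, based on the notes below (the exact fields can vary with your dbt version, and threads: 1 is just an example value):

```yaml
default:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: the-id-of-your-project
      dataset: temp
      threads: 1
```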

Notes:

  • oauth is all you need to say — the authentication libraries will recognize that it’s you connecting through dbt to BigQuery within the Cloud Shell, and no further configuration is needed.
  • the-id-of-your-project should be the id of your project.
  • temp is the name of a dataset you must create in BigQuery to hold your tables (one way to create it is shown below). Note that if you are using BigQuery without a credit card (the BigQuery sandbox), tables in your datasets expire after 60 days.
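The Cloud Shell comes with the bq command-line tool preinstalled, so creating that dataset can be as simple as:

```
bq mk --dataset the-id-of-your-project:temp
```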

Step 6: run dbt
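The same command as in step 4.1:

```
~/.local/bin/dbt run
```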

If everything has gone well, we have now run dbt! That means it created a sample view and a sample table in the dataset we provided.

Check Claire’s tutorial for more details on what happens next.

Step 6.1: the magic Cloud Shell editor

While editing your dbt files and configuration, check out the ready-to-use Cloud Shell editor. Launch it by clicking the “Open Editor” button in the Cloud Shell:

Open the Cloud Shell editor

Or just click on this link: ssh.cloud.google.com/cloudshell/editor

Step 7: the magic of docs

dbt is really good at generating documentation for our pipelines, and you can generate and serve it with the following commands:
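```
# build the documentation site for the project
~/.local/bin/dbt docs generate
# serve it locally (on port 8080 by default)
~/.local/bin/dbt docs serve
```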

The really magical part is this: if you are running dbt in the Cloud Shell and dbt gives you a localhost (127.0.0.1) URL to visit, you might think it won’t be easy to connect. But it is. Just click the link shown in the Cloud Shell, and magic will happen.

Turns out the Cloud Shell can create HTTP proxies that give you direct access to any web server running on it — and this immediately gives you access to the docs being served by dbt.

Isn’t that neat?

Next steps

  • Write all your SQL pipelines within the dbt environment, and make sure to keep them under version control.
  • Find a better way to create BigQuery ML models.

How do you run dbt every day on a schedule, in a serverless way? Check out Mete Atamel’s solution with Cloud Run and Cloud Scheduler.

More resources

Hamza Khan medium.com/weareservian/bigquery-dbt-modern-problems-require-modern-solutions-b40faedc8aaf

Mark Rittman rittmananalytics.com/blog/2020/2/8/multichannel-attribution-bigquery-dbt-looker-segment

Lak Lakshmanan medium.com/google-cloud/loading-and-transforming-data-into-bigquery-using-dbt-65307ad401cd

Want more?

I’m Felipe Hoffa, a Developer Advocate for Google Cloud. Follow me on @felipehoffa, find my previous posts on medium.com/@hoffa, and all about BigQuery on reddit.com/r/bigquery.
