
During the first lockdown in March 2020, in between Netflix episodes of Tiger King and The Last Dance, my housemate and I decided to learn Python.
He found a free course on some university’s website. It was terrible, and he’d quit within a week.
I found one that cost £10 on Udemy, and it was awesome.
But even a great course doesn’t compare to "real world" coding projects, so I still got bored and quit within a month.
Online courses are great – but only up to a point
When you’re first learning to code, you don’t want to faff around with setting up environments and downloading software.
You just want to open up your browser, click Start Course and get coding.
(That was why my friend quit – his course had too much setup faff, and it was too frustrating for a beginner.)
But after a while you realise that you’ll eventually need to learn how to work with cloud platforms and software.
Data scientists don’t write production code on Udemy’s in-browser code editor
But how can you practice coding in "real world environments" when so many online courses and platforms deliberately abstract those parts away?
My journey into the cloud
In this article—Part 1 of my miniseries on doing data science in the cloud—I’ll show you how you can practice coding in a "real-world" environment: BigQuery (part of Google Cloud Platform), which is great for practicing SQL.
Why BigQuery? First, because SQL is a staple for data science. Second, because it’s got a great free sandbox environment.
But, perhaps most importantly, because when I was just starting out with cloud platforms I found it to be very beginner friendly. Let me explain.
When I was first looking for data science jobs, it terrified me to see job requirements like "AWS," "Google Cloud Platform," "BigQuery," "Azure" or "Databricks."
I had no experience with any of these platforms, and I wasn’t sure how to get it. I wrongly assumed that you needed to be some sort of IT boffin to use them.
Practicing SQL on BigQuery helped me debunk that assumption and gain confidence with that mysterious thing called "the cloud."
Besides, even if you don’t end up using BigQuery in your job, I think it’s a good introduction to cloud-based databases. It’s got a lot of similarities to other database systems like those on AWS, Azure, Databricks, Snowflake, etc., and if you learn BigQuery, you’ll be 80% of the way towards knowing those platforms as well.
Setup
Normally at this point I’d just direct you to the docs. But I personally found the BigQuery docs quite confusing as a beginner, so I figured it would be helpful to explain a couple of the key concepts here:
The BigQuery Sandbox
Google maintains a free environment called the BigQuery Sandbox, which lets you use BigQuery for free (no credit card required) and process a whopping 1TB of data per month. I highlight this because, somewhat confusingly, the sandbox is different from BigQuery’s Free Tier (which requires card details). When using the sandbox, we won’t get access to some of BigQuery’s fancier services. But trust me when I say that the sandbox is going to be more than enough for us.
To sign up, go to https://cloud.google.com/bigquery/docs/sandbox, click Go to BigQuery, and follow the prompts.

"Projects"
Once you’re inside the BigQuery Sandbox, you’ll need to create a "Project". When I was starting out with BigQuery, I wasn’t quite sure what that meant, but the basic idea is that a BigQuery Project is kind of like a directory that holds all your data (e.g., any tables we create).
Note: If you use BigQuery in a corporate setting in the future, you’ll likely be asked to join a pre-existing project (not create your own), and your company will be billed per project (this makes it easy to organise costs). This is what happened to me when I joined my current company, for example. But remember, this doesn’t apply to us right now as we’re just using the sandbox!
To create a project, click Select a project and then New project in the pop-up box which appears. Give your project a name, and press Create project (note: you don’t need to specify an organisation).
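To make the "project = directory" idea concrete: every table in BigQuery is addressed by a three-part name, project.dataset.table. So if you later created a dataset and a table inside your project, you’d query it like this (my-first-project, my_dataset and my_table are all placeholder names – substitute your own):

-- Three-part naming: project.dataset.table
-- (all three names below are placeholders)
SELECT *
FROM `my-first-project.my_dataset.my_table`
LIMIT 10

The public datasets we’ll use shortly follow exactly the same pattern – they just live in a project owned by Google (bigquery-public-data) rather than in yours.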

Exploring the console
If you’ve not seen the inside of a cloud console before, it might look a little daunting, and you might find the docs a little too technical (as I did).
Here’s my quick "beginner tour":

- Project name – This shows the project in which we are currently working. In a corporate context, you might have lots of pre-existing projects to choose from. For now, we’ve only got one (the one we just created).
- Explorer pane – This is where we’ll see all our projects, plus the schemas of any tables we add to our database.
- Query tabs – Each of these tabs represents a ‘Query’ (i.e., a place where we will write SQL code).
Let’s get some data!
Hopefully that gives you the core information you need to get started with BigQuery. Now, let’s practice running some queries with BigQuery’s free Public Datasets:
- Click the ‘+ Add’ button in the Explorer pane on the left of the console, then click Public datasets in the pop-up that appears.

- Search for ‘StackOverflow’ – one of the datasets we’ll use – and then click ‘View dataset’.

- Now, you’ll see some dataset info, and a new drop-down – bigquery-public-data – has appeared in the Explorer pane. If you expand it, you’ll see tons of new datasets we’ve just been given access to! Take a quick look around – there’s loads of really cool stuff in there, including Google Trends data, Google Ads data, etc.
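Before writing anything fancy, it’s worth previewing a table to see what’s in it. Here’s a tiny query that pulls a handful of rows from the StackOverflow questions table (the column names below are the ones used in that public dataset):

SELECT
  title,
  creation_date,
  answer_count
FROM
  `bigquery-public-data.stackoverflow.posts_questions`
LIMIT 5

One tip: BigQuery stores data by column, so selecting only the columns you need (rather than SELECT *) keeps down the amount of data scanned – and therefore your sandbox quota.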

Run a query
Let’s write a simple query to get a feel for using SQL in BigQuery.
We’ll use the StackOverflow dataset, which contains tables on StackOverflow questions, answers, users and more.
Here’s a simple query which counts the number of questions asked in 2015, and the percentage of those which received answers:
SELECT
  EXTRACT(YEAR FROM creation_date) AS Year,
  COUNT(*) AS Number_of_Questions,
  ROUND(100 * SUM(IF(answer_count > 0, 1, 0)) / COUNT(*), 1) AS Percent_Questions_with_Answers
FROM
  `bigquery-public-data.stackoverflow.posts_questions`
GROUP BY
  Year
HAVING
  Year = 2015
ORDER BY
  Year
When you click Run, you’ll see the output of the code. BigQuery also provides Execution details which give information about how long the code took to run.
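If you want to experiment further, one natural variation is to drop the single-year filter and look at the trend over time. Here’s a sketch, using BigQuery’s COUNTIF shorthand in place of SUM(IF(...)):

SELECT
  EXTRACT(YEAR FROM creation_date) AS Year,
  COUNT(*) AS Number_of_Questions,
  -- COUNTIF(condition) counts the rows where the condition is true
  ROUND(100 * COUNTIF(answer_count > 0) / COUNT(*), 1) AS Percent_Questions_with_Answers
FROM
  `bigquery-public-data.stackoverflow.posts_questions`
GROUP BY
  Year
ORDER BY
  Year

This returns one row per year, so you can watch how the answer rate has changed over the life of the site.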

More practice
If you want more practice, I’ve created 10 exercises which you can use to practice working with BigQuery. You can view them for free on my SQL tutorial website theSQLgym.
One more thing –
Thanks for reading. I hope you found this helpful – feel free to connect with me on X or LinkedIn, or get my weekly Data Science/AI writing at AI in Five! 🙂