
Cloud computing in this era of big data is ubiquitous. However, many cloud services do not provide a location component to analyze and visualize geospatial data. Big Query, has built-in capabilities to ingest, process and analyze geospatial data.
In this tutorial, I will guide you through setting Bigquery Sandbox for free, processing spatial data with familiar PostGIS/Spatial SQL interface and visualize it right in the cloud.
Set up Free BigQuery Sandbox
Google generously offers a free sandbox which you can experiment. The BigQuery Sandbox gives you free access to try and experiment BigQuery functionalities with some [limits](http://The BigQuery sandbox gives you free access to the power of BigQuery subject to the sandbox’s limits. The sandbox lets you use the web UI in the Cloud Console without providing a credit card. You can use the sandbox without creating a billing account or enabling billing for your project. The web UI is the graphical interface that you can use to create and manage BigQuery resources and to run SQL queries. See the BigQuery web UI quickstart for a working introduction to the web UI.). With the Sandbox, you can use BigQuery without creating a billing account or enabling billing for your project.
They also offer 3-month, $300 free trial, which you might be eligible.
To use BigQuery sandbox, just go to this URL.
https://console.cloud.google.com/bigquery
Sign in with a Google Account (using an incognito mode preferably). Note that SANDBOX sign at the top left of the image below, which indicates that you are now in a free sandbox environment (10 GB of active storage and 1 TB of processed query data per month)

You can add data sources in the resources panel (highlighted blue in the left).In the middle panel, you have the query editor, where you can write your SQL syntax (Highlighted Green). Once you have your SQL ready, you can run the query using the run button.
BigQuery Public Datasets
Google has an extensive repository of Public datasets. At the time of writing this article, the number of available public datasets was 195. In this tutorial, we will use one of the open public datasets: Chicago crime. Feel free to identify an interesting dataset from the list and experiment with it, but all the examples in this article will be using Chicago Crime Data.
To add a public dataset, you need to click on+ sign ADD DATA
and then click on Explore public data set
. Choose an exciting dataset and click on it to view the dataset. In this example, we are using the Chicago Crime datasets.


Now that we have added public query datasets, we can query them. Let us see that in the next section.
Running GIS Queries with BigQuery
You can now run standard SQL queries to explore public datasets. However, since these datasets are usually large, you can run select statements with limiting the number of rows to preview the first few rows of the dataset or look at the schema of the table. However, You can preview the schema and few rows of the dataset without running any SQL code and saving the running query cost.
To view the schema of the dataset. Click on the dataset and in the lower button, and you will see the Schema
of the table (shown below).

If you want to preview the first few rows, you can then click on the Preview
, then you should see some rows of the dataset as shown below.

In the next section, we will go through the geographic features of BigQuery.
Creating and Processing Geographic Features
Our table has longitude and latitude columns; therefore, we can transform these values into geographic features using SQL geography functions. If you are already familiar with PostGIS, BigQuery GIS SQL syntax should also be typical.
We can create a geography column using ST_GeogPoint
function. Let us see an example.

Let us go through the SQL syntax first. We select the primary_type of the crime and create a geographic point using ST_GeogPoint
and pass the latitude and longitude columns. We also remove all NULL values in longitude column since we can not make a geographic point with coordinates.

If you look at the image above, you can save the results, but more importantly, BigQuery has a GIS visualization where you can plot maps interactively. To visualize Geospatial data, click Explore with GeoViz
and then a new window will pop up. Click on Authorize
and sign in with your account. You should now get the SQL syntax you have run earlier copied in BigQuery Geo Viz tool. Click on Run
. You should now see a base map with the results of the SQL, in this case, crime points.

If the data is already in Geospatial Format, running a simple SQL query will be enough. Say, for example, we want to get zip codes in Chicago. Big Query has a public dataset on geo_us_boundaries, and we can run a select statement including the geom column.

Now, we can visualize the results with BigQuery Geo Viz tool.

Now, that we can query Geographic datasets in BigQuery, Let us move on to a more advanced example using spatial join functions.
Spatial Join
Spatial join is one of the most widely used spatial processes. We use spatial join when we want to join data by location. Let us say we want to join each crime point to its zip code. We can use ST_Within
function to check whether the point is inside or outside the zip code.

The above SQL query joins Chicago crime points to zip codes. We are using ST_Within
function and pass the geographic points from the crimes and zip code geometry. The result is the table below where each point is matched with its zip code.
If we would like to count the number of crimes in each zip code, we can add a group by statement. The following SQL query returns the count of crimes in each zip code.

The result is shown below. We counted all crimes in each zip code.

You realize that we do not have any Geographic features because we could not group by a geographic feature. However, we can run with-statement queries to get the zip code geometry as well as the count of all crimes in each zip code.
Let us try that (See the SQL query below). There is nothing new here, we first run the usual group by a statement with spatial join and then joined to the actual zip code geometries.

The query result is shown below. You can see now that we have included the geometry column in our results.

Since we have a geometry column, we can also use the Goeviz tool to visualize the results. I needed to style the map, and therefore I used the counts’ column to plot a choropleth map (See below map).

Conclusion
In this article, we have covered some basic geographic functionality of BigQuery using the free Sandbox. After getting familiar with the BigQuery Interface, we have created and run simple queries using Geographic functions of BigQuery GIS. Finally, we have carried out an example of joining data by location.
There is a large number of Geographic functions in BigQuery GIS. You can find them here.
Geography functions in Standard SQL | BigQuery | Google Cloud