5 Geospatial Tips and Tricks in Python

Part 1: How to easily and effectively incorporate spatial features in Python using Geopandas.

Photo by L B on Unsplash

Dealing with geospatial data is often seen as cumbersome in the data science world. We seldom bother to include spatial features in our machine learning models, partly because of the complexity of geometries and geographic coordinate reference systems. However, I tend to get a boost in my machine learning models, and better insights, when I incorporate spatial features.

During the past month, I have been sharing geospatial tips and tricks in Python on Twitter, and they have received a lot of attention from the geospatial community.

In this article, I will share the first five tips and tricks for dealing with geospatial data with Geopandas. I used an online code beautifier tool to display the code.

Tip #1: Read Geospatial data directly from a zipped folder

We compress our data to save space and to transfer it quickly, so we often unzip or decompress it before reading it. However, that is not necessary. With Geopandas, you can pass the path of a zipped file straight to read_file.

Reading Geographic data zip files.

This technique also works with subfolders in a zip file. You just need to point it to the subfolder.

Tip #2: Plot missing data with a separate category

It is not unusual to have missing data in your datasets. By default, a choropleth map does not distinguish between zero and null values, so both end up in the same category.

However, if you need to have a separate category for null values, you can achieve it as shown in the following code snippet.

Geopandas exposes all of its visualization through the .plot() method, which accepts the keyword argument missing_kwds, where you provide the colour to use for missing values and the label for that category.
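As a rough sketch of how missing_kwds is passed, assuming a made-up GeoDataFrame with a "population" column where one value is missing (the column name, the squares and the label text are all illustrative):

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import box

# Three toy regions; the middle one has no population value.
regions = gpd.GeoDataFrame(
    {"population": [120.0, np.nan, 340.0]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

# Rows with a null "population" are drawn in light grey
# instead of being left out of the map.
ax = regions.plot(
    column="population",
    legend=True,
    missing_kwds={"color": "lightgrey", "label": "Missing values"},
)
```

As far as I can tell, the label only shows up as its own legend entry when the legend is categorical, for example when you also pass a classification scheme (which requires mapclassify); with a continuous colour bar, only the colour is applied.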

The result is this beautiful map, with a separate missing-values category shown in light grey in the legend.

Map with Missing Values

Tip #3: Export GeoDataFrame to PostGIS database

We often use PostGIS for storing and analyzing geospatial data. Once your spatial database is set up with PostgreSQL, you can transfer the data like a pro using Python and Geopandas.
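A minimal sketch of the export, assuming SQLAlchemy, GeoAlchemy2 and psycopg2 are installed and the PostGIS extension is enabled in the target database. The helper names, credentials and table name below are hypothetical placeholders, not part of any library.

```python
import geopandas as gpd
from sqlalchemy import create_engine
from sqlalchemy.engine import URL


def postgis_url(user, password, host, port, database):
    """Build a connection URL for a PostgreSQL/PostGIS database (hypothetical helper)."""
    return URL.create(
        "postgresql+psycopg2",
        username=user,
        password=password,
        host=host,
        port=port,
        database=database,
    )


def export_to_postgis(gdf, table, url, if_exists="replace"):
    """Write a GeoDataFrame to a PostGIS table (hypothetical helper)."""
    engine = create_engine(url)
    # to_postgis creates the table and stores the geometry column with its CRS.
    gdf.to_postgis(table, engine, if_exists=if_exists, index=False)
```

Usage would look like export_to_postgis(gdf, "countries", postgis_url("gis_user", "secret", "localhost", 5432, "gisdb")), with your own credentials in place of these made-up ones.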

Exporting from Python to PostGIS

Tip #4: Speed up Spatial Operations with PyGEOS

Spatial indexes are the engine behind spatial operations. These operations often take a long time to process, but with the PyGEOS integration in Geopandas, you can speed them up and boost your performance.

All you need to do is set use_pygeos to True.

PyGEOS

I have seen a significant improvement in my processing time using this new PyGEOS integration. So give it a try if you perform spatial operations on large datasets.

Tip #5: Add a base map to Geopandas Plots with Contextily

Although we take them for granted, base maps contextualize our maps. With the latest releases of Geopandas and the Contextily library, it is possible to overlay your maps on base maps from different providers.

The following snippet shows how to incorporate base maps in your Geopandas plots.

Contextily Base maps

And now the dots on your map sit on top of a rich, contextual base map.

Base map

Conclusion

Dealing with geospatial data does not need to be hard. Geopandas does much of the heavy lifting you need to deal with geospatial data effectively. In this article, I have shared five different tips and tricks for spatial data in Python.

If you would like to follow these tips and tricks as I post them on Twitter, you can find them at @spatialML.

