
My top 4 GeoPandas Functions: a practical story

Photo by June on Unsplash

I have been studying GeoPandas for the past couple of months, and it has quickly become one of my favourite Python packages. Personally, I am interested in how data science applies to fields such as urban planning and geospatial information and technology. For these, GeoPandas is especially useful, since it lets users manipulate geographic data and extract valuable information from it. In this story, I will draw on my learning experience to show you my top 4 GeoPandas functions using a practical example.

1. Getting the available datasets

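First off, if you are just starting with GeoPandas, you will need some easily accessible datasets. In GeoPandas releases before 1.0, you can list the bundled sample datasets with the "geopandas.datasets.available" attribute (the datasets module was deprecated in 0.14 and removed in 1.0), like so: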
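The snippet below is a minimal sketch of that lookup. It assumes a GeoPandas release older than 1.0; because newer releases no longer ship the sample data, the code falls back to an empty list instead of failing.

```python
import geopandas

# Assumption: GeoPandas < 1.0, where the bundled sample datasets still exist.
# On newer releases the attribute is missing, so we fall back to an empty list.
datasets_module = getattr(geopandas, "datasets", None)
available = list(getattr(datasets_module, "available", []))

print(available)
```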

In those older releases, this produces a list of three dataset names: "naturalearth_cities", "naturalearth_lowres" and "nybb".

2. Points from xy

If you work with geospatial data, it will not always arrive in a format you can use right away. One common case is when your data is held in latitude and longitude columns and you need Point geometries instead. Thankfully, GeoPandas provides a function, "points_from_xy", that quickly converts such columns into Point geometries. Let’s see it in action:

Using the cities dataset, let’s split the Point geometry column into x and y coordinates and add them as columns in the dataframe:
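As a rough sketch of this step, the example below builds a tiny stand-in for the cities dataset so that it runs without the bundled data. The three cities, their approximate coordinates and the column names "lon" and "lat" are my own illustrative choices, not taken from the original dataset.

```python
import geopandas
from shapely.geometry import Point

# Hypothetical stand-in for the cities dataset: Point(longitude, latitude).
cities = geopandas.GeoDataFrame(
    {"name": ["Sydney", "Nairobi", "Lima"]},
    geometry=[Point(151.21, -33.87), Point(36.82, -1.29), Point(-77.04, -12.05)],
    crs="EPSG:4326",
)

# For Point geometries, .x holds the longitude and .y holds the latitude.
cities["lon"] = cities.geometry.x
cities["lat"] = cities.geometry.y

print(cities[["name", "lon", "lat"]])
```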

Here is the new dataframe:

Now, we can use the "points_from_xy" function to transform the longitudes (x) and latitudes (y) back into Point geometries:
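Here is a minimal sketch of the conversion, starting from a plain pandas table. The table contents and the "lon" and "lat" column names are illustrative assumptions; the point to notice is that "points_from_xy" expects x (longitude) first and y (latitude) second.

```python
import geopandas
import pandas as pd

# Hypothetical table with plain latitude/longitude columns (approximate values).
df = pd.DataFrame(
    {
        "name": ["Sydney", "Nairobi", "Lima"],
        "lat": [-33.87, -1.29, -12.05],
        "lon": [151.21, 36.82, -77.04],
    }
)

# points_from_xy takes x first, then y: longitude is x, latitude is y.
geometry = geopandas.points_from_xy(df["lon"], df["lat"], crs="EPSG:4326")
cities = geopandas.GeoDataFrame(df, geometry=geometry)

print(cities)
```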

Here is the result:

3. Making Maps

A large part of working with geospatial data is being able to visualise it effectively on maps. GeoPandas allows for a lot of refinement in how users represent map data, including adding different map layers and dealing with missing data. In this part, let’s look at how we can make a multi-layered map. First, we need to create a base layer. Say we are only interested in cities located in the southern hemisphere. Let’s filter the data down to those cities and then plot them with the GeoDataFrame "plot" method:
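The filter-then-plot step could look roughly like this. The five cities are a hypothetical stand-in for the cities dataset, and the colour and marker size are arbitrary choices; the sketch also assumes matplotlib is installed, since GeoPandas plotting relies on it.

```python
import geopandas

# Hypothetical stand-in for the cities dataset (approximate coordinates).
cities = geopandas.GeoDataFrame(
    {"name": ["Sydney", "Nairobi", "Lima", "Oslo", "Tokyo"]},
    geometry=geopandas.points_from_xy(
        [151.21, 36.82, -77.04, 10.75, 139.69],  # longitudes (x)
        [-33.87, -1.29, -12.05, 59.91, 35.68],   # latitudes (y)
    ),
    crs="EPSG:4326",
)

# Keep only the cities whose latitude (the y coordinate) is below the equator.
southern_cities = cities[cities.geometry.y < 0]

# GeoDataFrame.plot returns a matplotlib Axes that later layers can reuse.
# In a script, call matplotlib.pyplot.show() afterwards to display the figure.
ax = southern_cities.plot(color="red", markersize=20)
```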

Now, let’s overlay these cities on a map of the world, which is provided among the GeoPandas sample datasets. We are going to adjust the colour of the countries’ borders, the fill colour of the countries, and the colour and size of the city points:
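Below is a sketch of the layering idea under a few assumptions: two rectangles stand in for the world dataset, the countries are drawn first so the points sit on top, and all names and colours are illustrative. The key detail is passing the first Axes to the second "plot" call through the "ax" argument.

```python
import geopandas
from shapely.geometry import box

# Hypothetical stand-ins: two rectangular "countries" instead of the world map.
countries = geopandas.GeoDataFrame(
    {"name": ["West", "East"]},
    geometry=[box(-80, -40, -30, 0), box(10, -35, 50, 5)],
    crs="EPSG:4326",
)
southern_cities = geopandas.GeoDataFrame(
    {"name": ["Lima", "Nairobi"]},
    geometry=geopandas.points_from_xy([-77.04, 36.82], [-12.05, -1.29]),
    crs="EPSG:4326",
)

# First layer: country polygons with a fill colour and a border colour.
base = countries.plot(color="lightgrey", edgecolor="black", figsize=(8, 4))

# Second layer: city points drawn on the same Axes via the ax argument.
ax = southern_cities.plot(ax=base, color="red", markersize=25)
```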

Now, we have a layered map in which we can see some of the world’s southern cities and the countries’ borders.

4. Calculating Areas

For the fourth and final function, I chose "area". Strictly speaking it is an attribute rather than a function, but it does exactly what its name suggests: it calculates the areas of shapes such as polygons and multipolygons, expressed in the square of the units of the data’s coordinate reference system. This can be very useful when manipulating geometric data, as we will see below.

For this example, let’s use the "nybb" dataset, which contains information about New York’s boroughs. As you can see below, the dataframe includes multipolygons representing the shape of each borough:

From here, it is pretty simple to calculate each polygon’s area:
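To keep the example runnable without the "nybb" data, the sketch below uses three made-up shapes in a metre-based projected CRS (EPSG:3857). The names, the shapes and the "area_m2" column are hypothetical; only the "area" attribute itself is the GeoPandas feature being shown.

```python
import geopandas
from shapely.geometry import MultiPolygon, box

# Hypothetical stand-in for the boroughs, in a CRS measured in metres,
# so the areas come out in square metres.
boroughs = geopandas.GeoDataFrame(
    {"name": ["North", "South", "Islands"]},
    geometry=[
        box(0, 0, 2000, 1000),
        box(0, -3000, 2000, 0),
        MultiPolygon([box(3000, 0, 3500, 500), box(4000, 0, 4500, 500)]),
    ],
    crs="EPSG:3857",
)

# .area returns one value per row; a multipolygon's area is the sum of its parts.
boroughs["area_m2"] = boroughs.area

print(boroughs[["name", "area_m2"]])
```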

Here is the output:

Let’s use this result to plot the boroughs with the GeoPandas plotting function, colouring them by area and adding a legend:
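A possible version of that plot is sketched here, again with made-up rectangles rather than the real boroughs. The "column" argument picks the values that drive the colours and "legend=True" adds the legend; the colour map, edge colour and title are arbitrary choices of mine.

```python
import geopandas
from shapely.geometry import box

# Hypothetical stand-in for the boroughs, in a metre-based CRS.
boroughs = geopandas.GeoDataFrame(
    {"name": ["North", "South", "East"]},
    geometry=[
        box(0, 0, 2000, 1000),
        box(0, -3000, 2000, 0),
        box(2000, -3000, 3000, 1000),
    ],
    crs="EPSG:3857",
)
boroughs["area_m2"] = boroughs.area

# Colour each shape by its area and add a legend for the colour scale.
ax = boroughs.plot(column="area_m2", legend=True, cmap="viridis", edgecolor="black")
ax.set_title("Stand-in boroughs coloured by area")
```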

Now that we have walked through these 4 functions, I hope you feel inspired to start working with GeoPandas and explore it further. Thanks for reading, and please contact me with any questions!

