The world’s leading publication for data science, AI, and ML professionals.

How To Handle Map Projections Properly In Python

A walkthrough on both concepts and code in Python.

Photo by Simon Migaj on Unsplash
Photo by Simon Migaj on Unsplash

If you have ever tried overlaying Geospatial data and it turns out that they do not align properly, the chances are that you have encountered the annoying coordinate reference system problem. In this tutorial, we first clarify the concept and terminology of the Coordinate Reference System (CRS) followed by a tutorial on how to handle CRS in Python using Geospatial data.

Hey Mercator – Africa is Larger than North America

Maps are mostly flat and 2-Dimensional, whether they are on paper or screen. However, the earth is not 2-Dimensional. Therefore, maps always lie – sort of. To represent large parts of the earth, you have to make a trade-off between size, angle, shape or directions. There is no way you can get all of them right. So what is the best way to represent the earth on a flat surface?

xkcd
xkcd

Well, it depends! That is why we use Mercator projection for most of the web maps today. The Mercator projection preserves all angles in their correct shape, and that means if yo measure the angle using this projection, you get the right direction in the real world – useful in map navigation.

However, you can not compare different country sizes in web Mercator as it does not preserve size. The African continent is much larger than it appears and also Canada and Rusia take-up a large surface while in reality, they only occupy 5% of the earth surface.

Source
Source

Geographic vs Projected

There are two closely related concepts here that need clarification. We have the Geographic Coordinate System, which is mostly used for global mapping. Geographic Coordinate systems usually use Latitude and Longitude. In contrast, there are multiply local projections that bring visual distortions, which is called the Projected Coordinate System. Universal Transverse Mercator coordinate system, State Plane and Robinson projections are the most widely used projections.

In Geographic Coordinate systems, the unit of measurement is decimal degrees which help locate places on the earth. However, to measure distances in other units like meters or feet, we need projections.

CRS with Geopandas

With just a basic understanding of Coordinate reference systems and projections, it is straightforward to handle them properly in Python, thanks to Geopandas. Let us read three different datasets to illustrate the importance of handling projections properly.

import geopandas as gpd
import matplotlib.pyplot as plt
# Read world Countries
world = gpd.read_file(
gpd.datasets.get_path("naturalearth_lowres")
)
# Read world cities
cities = gpd.read_file(
 gpd.datasets.get_path("naturalearth_cities")
)
# Read Graticules 
graticules = gpd.read_file(
 "ne_110m_graticules_10/ne_110m_graticules_10.shp"
)

To know which Coordinate Reference System the data, Geopandas has .crs() , which can provide information about the data and its reference system.

world.crs

Let us see and understand the output of this method.

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

First, the data is in Geographic 2D CRS with the World Geodetic System (WGS84). We usually use codes for different coordinate systems. The WGS84 has this code EPSG:4326. You can find these codes from EPSG.io website or search it in the Spatial Reference Organization.

We need to check if all the dataset has the same CRS before plotting them together.

cities.crs == world.crs == graticules.crs

This returns True since all the datasets have the same coordinate systems. So let us go ahead and plot the data on a map.

fig, ax = plt.subplots(figsize=(12,10))
world.plot(ax=ax, color="lightgray")
cities.plot(ax=ax, color="black", markersize=10, marker ="o")
graticules.plot(ax=ax, color="lightgray", linewidth=0.5)
ax.set(xlabel="Longitude(Degrees)",
 ylabel="Latitude(Degrees)",
 title="WGS84 Datum (Degrees)")
plt.show()
World map in WGS84
World map in WGS84

Notice also the x and y-axis which has decimal degrees.

Projections

To project any Geographic data, Geopandas has also .to_crs() method which takes a projection code. Let us reproject the data. Let us start with Eckert IV Projection.

The Eckert IV projection is an equal-area pseudocylindrical map projection. The length of the polar lines is half that of the equator, and lines of longitude are semiellipses, or portions of ellipses. – Wikepedia

world_eckert = world.to_crs("ESRI:54012")
graticules_eckert = graticules.to_crs("ESRI:54012")

Now, let us print out the CRS of any of these projected datasets. The world Eckert projected dataset has the following parameters.

<Projected CRS: ESRI:54012>
Name: World_Eckert_IV
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Coordinate Operation:
- name: World_Eckert_IV
- method: Eckert IV
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

Notice the difference between the world dataset and world Eckert projected dataset. The later one has Projected CRS while the first had a Geographic 2D CRS. The axis of the first dataset was ellipsoidal with degrees as a unit of measurement, while the later one has cartesian coordinates with a metre as a unit of measurement.

Now, let us try to plot these two projected datasets and the unprotected city dataset. Since we will plot these maps several times, let us make a function and plot them.

def plot_map(gdf1, gdf2, gdf3, name,unit):
     fig, ax = plt.subplots(figsize=(12,10))
     gdf1.plot(ax=ax, color="lightgray")
     gdf2.plot(ax=ax, color="black", markersize=10, marker ="o")
     gdf3.plot(ax=ax, color="lightgray", linewidth=0.5)
     ax.set(xlabel="X Coordinate -"+unit,
            ylabel="Y Coordinate -" +unit,
            title=name
            )
plt.show()
plot_map(world_eckert, cities, graticules_eckert, "Eckert IV - Cities not Projected", "Metre")

Where are the city points dataset?

Mismatched Projection map
Mismatched Projection map

The cities are clustered in one place. Most of the time, the problem of misaligned layers in the GIS world is due to mismatched CRS and projections. We can fix this by also projecting the cities dataset to Eckert IV projection.

cities_eckert = cities.to_crs("ESRI:54012")
plot_map(world_eckert, cities_eckert, graticules_eckert, "Eckert IV - With projected Cities", "Metre")
Correctly matched world map - Eckert Projection.
Correctly matched world map – Eckert Projection.

Finally, let us bring out another projection and also show the differences between any of the projections out there. Let us go with Robinson projection.

world_robinson = world.to_crs("ESRI:54030")
graticules_robinson = graticules.to_crs("ESRI:54030")
cities_robinson = cities.to_crs("ESRI:54030")
plot_map(world_robinson, cities_robinson, graticules_robinson, "Robinson Projection", "Metre")
World Robinson Projection
World Robinson Projection

The above two projections might look like the same from the first glance, but they are different. We can look at overlayed maps of the two projections.

Eckert Projection (Blue) and World Robinson Projection(Red)
Eckert Projection (Blue) and World Robinson Projection(Red)

Any calculations done on different projections result in different results, for example, area size or distance.

Conclusion

In this article, we have explained the basic concepts of Geographic Coordinate Reference System (CRS) and Projected Coordinates. We have also seen how to handle them correctly in Python using Geopandas.


Related Articles