Where is the population center of the world?

Finding the smallest circle on the globe that contains 50 percent of the world's population

Paul Hiemstra
Towards Data Science


Much of the power in the world is concentrated in the West, in places like the US and Europe, which makes those places very prominent on the world stage. But where is the center of the world if we go by population numbers? Inspired by a Real Life Lore YouTube video, I went looking for the smallest circle one can draw on the Earth such that 50 percent of people live inside the circle and 50 percent outside. The center of that circle could be dubbed the population center of the world.

In this article we will delve into a lot of geography-related topics, an old passion of mine dating back to my days as a PhD student. You will learn how to:

  • Read grid data into Python
  • Intersect a grid dataset with a polygon
  • Transform data between geographic projections
  • Draw circles across a round Earth on a flat map, using what are also known as great-circle distances

In the next few sections I will slowly work towards our solution, starting with the population data underlying all of our analyses.

This article and the code are also available on GitHub.

The population source data

At the core of our analysis is the population data. I chose to use a 1 km grid population dataset that was published on WorldPop. This dataset was collated by a number of universities under a Bill and Melinda Gates Foundation grant. It covers the entire globe, and provides the number of people living in a particular gridcell:

Visualisation of the world population grid data

I chose to use the most recent dataset they provide, which is from 2020. Reading it into Python gives:

array([[-3.4028235e+38, -3.4028235e+38, -3.4028235e+38, ...,
-3.4028235e+38, -3.4028235e+38, -3.4028235e+38],
[-3.4028235e+38, -3.4028235e+38, -3.4028235e+38, ...,
-3.4028235e+38, -3.4028235e+38, -3.4028235e+38],
[-3.4028235e+38, -3.4028235e+38, -3.4028235e+38, ...,
-3.4028235e+38, -3.4028235e+38, -3.4028235e+38],
...,
[-3.4028235e+38, -3.4028235e+38, -3.4028235e+38, ...,
-3.4028235e+38, -3.4028235e+38, -3.4028235e+38],
[-3.4028235e+38, -3.4028235e+38, -3.4028235e+38, ...,
-3.4028235e+38, -3.4028235e+38, -3.4028235e+38],
[-3.4028235e+38, -3.4028235e+38, -3.4028235e+38, ...,
-3.4028235e+38, -3.4028235e+38, -3.4028235e+38]], dtype=float32)

The read command produces a numpy array with all the population data. A striking feature is the large number of -3.4028235e+38 values, which the metadata of the file reveals to be NA values. These mostly cover the water bodies, where nobody lives. Next we convert these values to proper NA values and sum up all the non-NA values, expressed in billions:

7.966754816

So according to the dataset, about 7.97 billion people lived on the Earth in 2020. This is nicely in line with what I expected. For later use we repackage this code into a function that determines the total population size of a .tif grid file:

7966755000.0
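The masking, summing and repackaging steps above could be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code behind the numbers shown: it assumes the grid is read with rasterio (an assumption, imported lazily so the helper also works without it), and the function names and the tiny demo grid are made up for illustration.

```python
import numpy as np

# float32 nodata marker, as it appears in the printed array above
NODATA = np.float32(-3.4028235e+38)


def total_population(grid, nodata=NODATA):
    """Sum all grid cells, skipping cells that carry the nodata marker."""
    grid = np.asarray(grid, dtype=np.float32)
    if nodata is not None:
        # Turn the nodata marker into NaN so it counts as missing
        grid = np.where(grid == np.float32(nodata), np.nan, grid)
    # Accumulate in float64 to limit rounding error over many cells
    return float(np.nansum(grid, dtype=np.float64))


def get_population(tif_path):
    """Total population of a GeoTIFF grid (assumes rasterio is installed)."""
    import rasterio  # assumption: rasterio is used as the reader

    with rasterio.open(tif_path) as src:
        # Band 1 holds the counts; src.nodata comes from the file metadata
        return total_population(src.read(1), nodata=src.nodata)


# Tiny synthetic grid: two "water" cells and four land cells
demo = np.array([[NODATA, 10.0, 2.5], [NODATA, 0.0, 7.5]], dtype=np.float32)
print(total_population(demo))
```

Keeping the pure-array helper separate from the file reader makes the NA handling easy to check on a small array before running it on the full global grid.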

Boxing in the population data

In our first step we added up all the population gridcells on the map. Obviously, in our quest to find the circle containing half the population we do not want to add up the entire map. As a first step we are going to determine the total population inside a small bounding box. To do this we need to determine which of our gridcells are inside the box, and which are not. In geography jargon this is called intersecting the bounding box with the gridcells. First we construct the bounding box we are interested in:

Image by Author

which nicely shows that we are using the bounding box of the Netherlands as our first target.
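A bounding box like this can be built with nothing more than the standard library, as a GeoJSON polygon. The sketch below is only an illustration: the function name is hypothetical and the coordinates are an approximate, assumed extent for the Netherlands, not the exact values used for the figure.

```python
import json


def bbox_to_geojson(min_lon, min_lat, max_lon, max_lat):
    """Return a GeoJSON FeatureCollection holding one rectangular polygon."""
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],  # a GeoJSON ring must end where it started
    ]
    feature = {
        "type": "Feature",
        "properties": {"name": "bounding box"},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
    return {"type": "FeatureCollection", "features": [feature]}


# Approximate, assumed extent of the Netherlands in degrees (lon/lat, WGS84)
netherlands_box = bbox_to_geojson(3.3, 50.7, 7.3, 53.6)
print(json.dumps(netherlands_box, indent=2))
```

Note that GeoJSON stores coordinates as longitude first, latitude second, which is the opposite of how coordinates are often spoken.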

The first solution I explored involved converting each of the gridcells to a geographic point, and performing the intersection between the box and the points. The issue with this solution is that it takes an enormous amount of RAM, and my Python session would simply crash each time I tried to generate the points. To remedy the situation, I chose to outsource the work to GDAL. GDAL is a geospatial processing library with bindings for Python.

In GDAL I can use the Warp tool to perform the required analysis. Warp, however, expects all the input data to be files on disk. So first we have to save the bounding box to disk, and then we can call GDAL:

Here we use the cutline ability of Warp to cut down our grid to only the gridcells inside the bounding box, and dump the result as a new tif file. After that, we can simply use our get_population function to calculate the total number of people, in millions, living inside the box:

31.186402

which shows that 31 million people live inside the box. This nicely aligns with the 17 million people living in the Netherlands, combined with the population centers in Belgium and Germany that also fall into the box. For convenience we wrap these steps into a new function for later use:

31.186402
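The clipping step in this section could be sketched as below. To keep the example free of heavy dependencies, it builds a call to the gdalwarp command-line tool instead of using the GDAL Python bindings, so it is a stand-in for the Warp call described above rather than a copy of it. The function names and file names are hypothetical.

```python
import subprocess


def build_gdalwarp_command(src_tif, cutline_path, dst_tif):
    """Build a gdalwarp call that keeps only the cells inside the cutline."""
    return [
        "gdalwarp",
        "-overwrite",              # replace dst_tif if it already exists
        "-cutline", cutline_path,  # polygon file used as the cookie cutter
        "-crop_to_cutline",        # shrink the output extent to the polygon
        src_tif,
        dst_tif,
    ]


def clip_grid(src_tif, cutline_path, dst_tif):
    """Run gdalwarp; needs the GDAL command-line tools on the PATH."""
    command = build_gdalwarp_command(src_tif, cutline_path, dst_tif)
    subprocess.run(command, check=True)
    return dst_tif


# Hypothetical file names, shown only to illustrate the resulting command
cmd = build_gdalwarp_command("population.tif", "bounding_box.geojson", "clipped.tif")
print(" ".join(cmd))
```

A wrapper such as get_population_in_shape would then save the polygon to disk, call clip_grid, and pass the clipped file to get_population.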

Upgrading the box to a circle

But our challenge was to perform this operation with a circle, not a box. Luckily, the entire workflow given above is not limited to a box shape, but works for any polygon. So, our next challenge is to construct a circle around a specific center point with a particular radius.

A problem here is that the population grid we use is defined in latitude-longitude coordinates. This projection of a round Earth on a flat plane prevents us from simply drawing circles on the map. The solution I found takes the lat-lon center point we provide and reprojects it to a projection in which distances are measured in meters. Then we construct a circle in that projection, and project the circle back to latitude-longitude (WGS84):

Image by Author

Which nicely shows that our perfect circle of 2200 kilometers translates to an oval in the latitude-longitude projection. The same distortion explains why Greenland looks about as large as Africa on many world maps, even though Africa is roughly fourteen times larger. With our Greenland circle ready for use, we can call the get_population_in_shape function:

1.81848775

This confirms that not a lot of people live in or around Greenland: 1.8 million, to be precise.
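The circle construction in this section can also be approximated without any reprojection. The sketch below swaps in a different technique from the one described above: it treats the Earth as a sphere and places vertices with the standard destination-point formula, then checks them with the haversine distance. The center coordinates are an assumption chosen for illustration, not the center used for the figure.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; the Earth is treated as a sphere


def destination_point(lat, lon, bearing_deg, distance_km):
    """Point reached from (lat, lon) after distance_km along a bearing."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    sin_phi2 = (math.sin(phi1) * math.cos(delta)
                + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    phi2 = math.asin(max(-1.0, min(1.0, sin_phi2)))  # clamp rounding noise
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    # Wrap the longitude back into [-180, 180)
    return math.degrees(phi2), (math.degrees(lam2) + 180.0) % 360.0 - 180.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def geodesic_circle(lat, lon, radius_km, n_points=72):
    """Closed ring of (lon, lat) vertices, each radius_km from the center."""
    ring = []
    for i in range(n_points):
        bearing = 360.0 * i / n_points
        p_lat, p_lon = destination_point(lat, lon, bearing, radius_km)
        ring.append((p_lon, p_lat))
    ring.append(ring[0])  # close the ring
    return ring


# Assumed center near southern Greenland; radius as in the text
circle = geodesic_circle(65.0, -40.0, 2200.0)
lats = [lat for _, lat in circle]
lons = [lon for lon, _ in circle]
print(max(lats) - min(lats), max(lons) - min(lons))  # extent in degrees
```

The printed extents differ strongly between latitude and longitude, which is the oval effect seen on the map. A ring built this way needs extra handling when the circle contains a pole or crosses the antimeridian, which this sketch does not attempt.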

Finding the center of population

In the YouTube video, the presenter names a 3300 km circle around the town of Mong Khet as the smallest circle that contains 50 percent of the world's population. Using our tools and data we can check this:

Image by Author

which shows that according to our data we are close, but only 48 percent of the world lives inside the circle. A bit of tinkering with the radius shows that we can get almost exactly 50 percent if we expand the circle a bit. The output below gives the population inside the expanded circle, in millions, and its share of the world total:

3983.822592
0.5000559

That my circle needs to be a bit bigger could be due to the particular dataset I used.
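The manual tinkering with the radius could be replaced by a simple bisection, since the population inside a circle can only grow when the radius grows. The sketch below is an assumption about how that might look, not the procedure used above. In real use, population_within would wrap the circle construction and get_population_in_shape; here a toy function stands in so the example runs on its own.

```python
def find_radius_for_share(population_within, total, target_share=0.5,
                          lo_km=0.0, hi_km=20000.0, tol_km=1.0):
    """Bisect on the radius until the enclosed share reaches the target.

    population_within(radius_km) must never decrease as the radius grows.
    Returns a radius at most tol_km above the smallest qualifying radius.
    """
    if population_within(hi_km) / total < target_share:
        raise ValueError("target share is not reached within hi_km")
    while hi_km - lo_km > tol_km:
        mid = 0.5 * (lo_km + hi_km)
        if population_within(mid) / total < target_share:
            lo_km = mid   # too few people inside: grow the circle
        else:
            hi_km = mid   # enough people inside: try a smaller circle
    return hi_km


# Toy stand-in: population grows with the square of the radius and
# saturates at 8 billion people once the radius reaches 10,000 km
def toy_population_within(radius_km):
    return 80.0 * min(radius_km, 10000.0) ** 2


toy_total = toy_population_within(20000.0)
radius = find_radius_for_share(toy_population_within, toy_total)
print(radius, toy_population_within(radius) / toy_total)
```

Each bisection step costs one clipping run over the grid, so a tolerance of 1 km needs about fifteen runs when starting from a 20,000 km bracket.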

Based on these tools, a good next step could be to use an optimisation algorithm to try and determine the exact location and radius where we can find the smallest circle that covers 50 percent of the Earth's population.

Who am I?

My name is Paul Hiemstra, and I work as a teacher and data scientist in the Netherlands. I am a mix between a scientist and a software engineer, and have a broad interest in everything related to data science. You can follow me here on Medium, or on LinkedIn.

Population data attribution

WorldPop (www.worldpop.org - School of Geography and Environmental Science, University of Southampton; Department of Geography and Geosciences, University of Louisville; Departement de Geographie, Universite de Namur) and Center for International Earth Science Information Network (CIESIN), Columbia University (2018). Global High Resolution Population Denominators Project - Funded by The Bill and Melinda Gates Foundation (OPP1134076). https://dx.doi.org/10.5258/SOTON/WP00647
