Mapping the Real World

Pratyush More
Towards Data Science
Feb 26, 2018

Fig 1. Administrative divisions of China and Taiwan on (Left) an equal-area map, and (Right) a Flow-Based Cartogram where areas are proportional to GDP.

We all think we know how the world looks. But do our maps of the world always reflect what is most important? Much has been said, for instance, about the astonishing economic growth witnessed by China over the past few decades. China’s geographical map in Figure 1 (Left), however, tells us nothing about the extent of the Gross Domestic Product (GDP) growth in different parts of the country.

What if we could produce more visually informative maps?

As it turns out, we can. Figure 1 (Right) presents a map of China with regions rescaled according to their respective GDP contributions. Not only does it portray the China that people talk about in the context of economic progress, it also displays the severe distortion and inequality in the country’s development. It is a hard-hitting and evocative depiction of the ground reality: West and Northeast China are practically absent from the great Chinese success story.

Maps such as the one in Figure 1 (Right) are known as cartograms. Regular maps prioritize geographical accuracy in terms of shape, size, and location. Cartograms, on the other hand, rescale map regions according to statistical data such as population or GDP, which might be of greater interest for a particular application, while preserving the map’s topology (which regions border which) as far as possible.
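To make "rescaling according to statistical data" concrete, here is a minimal sketch of how the target area of each region could be computed. The function name `target_areas` and the three regions with their numbers are made up for illustration. This only computes the areas a cartogram aims for; it does not deform any map.

```python
def target_areas(areas, values):
    """Return the area each region should have on a cartogram.

    areas  : dict mapping region name -> area on the ordinary map
    values : dict mapping region name -> statistic (e.g. GDP or population)

    The total map area is preserved; each region receives a share of it
    proportional to its share of the statistic.
    """
    total_area = sum(areas.values())
    total_value = sum(values.values())
    return {name: total_area * values[name] / total_value for name in areas}


# Made-up numbers for three hypothetical regions (not real data).
areas = {"A": 60.0, "B": 30.0, "C": 10.0}
values = {"A": 10.0, "B": 30.0, "C": 60.0}
targets = target_areas(areas, values)
```

In this toy input, region C is the smallest on the ordinary map but holds most of the statistic, so it is the one that has to grow the most.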

Beyond their shock value, cartograms also offer statistical advantages over regular maps in visualizing spatial data. As an illustration, take the map describing the results of the 2016 US presidential election in Figure 2 below. The states colored red were won by Donald Trump, of the Republican Party. The blue states were won by Hillary Clinton, of the Democratic Party.

Fig 2. 2016 US presidential election map. Red indicates Republican victory. Blue indicates Democratic victory.

A quick glance at this map would suggest a landslide victory for Donald Trump. This erroneous conclusion arises because the areas of the regions on the map do not reflect the quantitative data they represent. For example, Idaho (ID) and Rhode Island (RI) both hold 4 electoral votes. But Idaho occupies a prominent place on the map, whereas Rhode Island is barely noticeable.

If we instead represented the same data with a cartogram based on the electoral votes of each state, we would obtain a much more visually representative portrayal of the election results. Such a cartogram is included below in Figure 3. The areas colored blue and red now accurately portray how many electoral votes were won by the Democrats and the Republicans respectively.

Fig 3. US presidential election results on a Flow-Based Cartogram. Red indicates Republican victory. Blue indicates Democratic victory.

The diffusion-based cartogram has long been the most popular type of cartogram in use. This cartogram generation technique finds inspiration in the physical process of diffusion. The mathematics of diffusive movement has been carefully studied and documented in the scientific literature, and lends itself well to solving the problem at hand.

As an analogy, consider spraying some perfume in one corner of a room. After a few minutes, you would be able to smell the fragrance in the other parts of the room as well. This is because the scent diffuses from areas of high density (the corner in which it was sprayed) to areas of low density (the rest of the room). Similarly, each region on the map can be thought of as being made up of particles. For a population cartogram, then, particles in regions of high population density would flow outward and spread, thus enlarging these regions and consequently shrinking regions of low population density.
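The perfume analogy can be tried out in a few lines. The sketch below is a toy one-dimensional diffusion on a small grid with closed walls, using a simple explicit finite-difference step. It is only meant to show density flowing from crowded cells to sparse ones until everything is even; it is not the diffusion-based cartogram algorithm itself, and the function name `diffuse`, the grid, and the `rate` value are assumptions made for illustration.

```python
def diffuse(density, steps, rate=0.25):
    """Explicit finite-difference diffusion on a 1D grid with closed ends.

    Each step moves a little density from every cell toward its
    less dense neighbours. rate must be at most 0.5 for stability.
    """
    rho = list(density)
    n = len(rho)
    for _ in range(steps):
        new = rho[:]
        for i in range(n):
            # At the walls, mirror the cell itself so nothing leaks out.
            left = rho[i - 1] if i > 0 else rho[i]
            right = rho[i + 1] if i < n - 1 else rho[i]
            new[i] = rho[i] + rate * (left - 2 * rho[i] + right)
        rho = new
    return rho


# "Perfume" concentrated in one corner of the room.
initial = [8.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
final = diffuse(initial, steps=2000)
```

After enough steps every cell holds the same density (the average of the initial values), while the total amount of "perfume" is unchanged.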

That said, such cartogram-generating methods have been rather slow. They can take tens of minutes on present-day hardware to create even simple cartograms such as the ones used in this article. This time-consuming process is a huge deterrent to the widespread adoption of cartograms as a data visualization technique.

To accelerate the cartogram generation process and facilitate their common use, I, along with Professor Michael T. Gastner and Vivien Seguy, researched and devised a new algorithm to generate what we call a Flow-Based Cartogram. The details are in our research publication in the Proceedings of the National Academy of Sciences of the USA.

In summary, when generating cartograms, we want to equalize the density at all locations on the map. Diffusion is one way to do so, but it is not the only way. The new mathematics underlying our technique led to algorithmic efficiencies, and enabled us to break the computation into small parts that can be computed independently of one another, thus making use of the multicore processors that are in wide use today.
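What "equalizing the density" means is easiest to see in one dimension, where the answer can be written down directly: each boundary moves to a position proportional to the cumulative total of the statistic to its left. The sketch below illustrates that goal only. It is not the Flow-Based algorithm from the paper, which works on two-dimensional maps, and the function name `equalize_1d` and the sample numbers are hypothetical.

```python
def equalize_1d(boundaries, densities):
    """Move region borders on a line so every region ends with equal density.

    boundaries : sorted positions x_0 < x_1 < ... < x_n of the region borders
    densities  : density of each of the n regions between consecutive borders
    """
    length = boundaries[-1] - boundaries[0]
    # Mass of a region = density * width (e.g. its population).
    masses = [d * (b - a)
              for d, a, b in zip(densities, boundaries, boundaries[1:])]
    total = sum(masses)
    new_boundaries = [boundaries[0]]
    cumulative = 0.0
    for m in masses:
        cumulative += m
        # Each border lands where the same fraction of total mass lies to its left.
        new_boundaries.append(boundaries[0] + length * cumulative / total)
    return new_boundaries


old = [0.0, 1.0, 2.0, 3.0]                      # three regions of equal width
new = equalize_1d(old, densities=[6.0, 3.0, 1.0])
```

The densest region widens and the sparsest one shrinks, while the overall length of the line stays the same.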

The result of our work was a 60-fold increase in the speed of the cartogram generator. Each of the cartograms presented in this article was generated within a matter of seconds. The China cartogram took our method less than 3 seconds to produce, and the US election cartogram 1.5 seconds. As a further example, the cartogram of India in Figure 4 (Right), where states are rescaled according to their respective GDP contributions, took 2.6 seconds to produce using our Flow-Based method.

Fig 4. States and union territories of India on (Left) an equal-area map, and (Right) a Flow-Based Cartogram where areas are proportional to GDP.

Cartograms are a powerful tool for presenting hard-hitting representations of ground truths, and thus for raising awareness of these truths among a large audience. I hope that our research lowers the barriers to creating cartograms and thus bolsters their adoption. With this goal in mind, we have also made our software available on GitHub, along with instructions on how to use it. Enjoy a gridded cartogram of the world, created using this software and rescaled according to population, in Figure 5 below.

Note: I am working on a Python package to make cartogram generation even simpler. I also plan to build a web app to achieve the same end. Both of these projects will be open-sourced. If you would like to collaborate on either, please email me at pratyushmore1996@gmail.com.

Research Paper: PNAS Article

Software: GitHub

Fig 5. Flow-Based Cartogram of a gridded world map, rescaled according to population.
