Displaying Geographic Information Using Custom Map Tiles
Learn how to create custom tiles for your interactive maps
Interactive maps are now a staple of our everyday digital life. We use them to learn of our whereabouts, plan the next trip, or review our past travels. In a professional setting, maps have become priceless tools for all manner of businesses in planning, operations, and analytics.
An interactive map displays a patchwork of square tiles, each containing a small part of the complete image. A cloud-based service provides these tiles, either by retrieving them from a cache or by generating them on the fly. The map software manages the display as if showing a continuous bitmap but operates on a tile basis in the background. Whenever the user changes the zoom level or pans the map, the software requests the newly visible tiles from the service and discards those that are no longer required.
When zooming in or out, the map software stretches or compresses the current display while retrieving the new level’s tiles. It then overlays the new tiles on top of the old ones, providing a pleasant visual continuity.
There is nothing better than a live map example to help build your intuition of what is going on behind the scenes. Please follow the previous link and select the “Show tile borders” option. Browse the map and see how the software juggles the tiles, how it stretches them, and how it replaces and refreshes the content. When you hover the cursor over a tile, the software displays that tile’s coordinates.
Tile Coordinates
We can think of the whole tileset as a two-dimensional space where each tile receives a unique pair of integer coordinates at any given zoom level. The first zoom level, level zero, fits the entire mappable surface of the Earth into a single tile. Each subsequent zoom level splits every tile of the previous level into four, thus doubling both the detail and the range of the integer coordinates. At zoom level one, both x and y coordinates range between zero and one, while at level two, they range between zero and three. In general, at zoom level z they range between zero and 2ᶻ − 1. This process repeats up to the maximum supported zoom level, usually between 18 and 23. We can say that the tuple (x, y, zoom) uniquely identifies a tile.
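To make the addressing concrete, here is a minimal sketch of how a latitude and longitude pair maps to tile coordinates. It assumes the standard Web Mercator tiling scheme used by OpenStreetMap-style servers, and the function name `latlon_to_tile` is a hypothetical helper, not part of any library.

```python
import math
from typing import Tuple


def latlon_to_tile(lat: float, lon: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) coordinates of the tile containing a point.

    Assumes the Web Mercator tiling scheme, with tile (0, 0) at the
    top-left corner of the map.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that edge values such as lon == 180 stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# Ann Arbor, Michigan, sits at roughly latitude 42.28, longitude -83.74.
print(latlon_to_tile(42.28, -83.74, 2))
```

At zoom level zero every point falls into tile (0, 0), and each additional level doubles the value of `n`, matching the coordinate ranges described above.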
Interactive map software like Leaflet and its Python wrapper Folium use this type of tile addressing when issuing server requests. The server converts these coordinates into the tile’s geospatial shape, a rectangle in latitude and longitude space (a square in the map’s Web Mercator projection), and uses it to query the underlying data. With these data, the server renders the tile’s graphic content and sends it back to the map software client.
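The server-side conversion from tile coordinates to a geographic bounding box can be sketched as follows. This again assumes the Web Mercator tiling scheme, and `tile_bounds` is a hypothetical name used only for illustration.

```python
import math
from typing import Tuple


def tile_bounds(x: int, y: int, zoom: int) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) in degrees for a tile."""
    n = 2 ** zoom  # number of tiles along each axis

    def lon_of(tile_x: int) -> float:
        # Longitude of the left edge of column tile_x.
        return tile_x / n * 360.0 - 180.0

    def lat_of(tile_y: int) -> float:
        # Latitude of the top edge of row tile_y (inverse Mercator).
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * tile_y / n))))

    return lon_of(x), lat_of(y + 1), lon_of(x + 1), lat_of(y)


west, south, east, north = tile_bounds(0, 0, 0)
print(west, south, east, north)
```

The single tile at zoom level zero spans all longitudes but stops at about 85.05 degrees of latitude, which is the limit of the mappable surface in this projection.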
Quadkeys Revisited
We can also address each tile through its corresponding quadkey. A quadkey uniquely encodes a tile into a single string or number, convenient for cache or dictionary keys. Depending on the integer encoding scheme, we can include the zoom level or leave it out as contextual information. Quadkeys are very convenient because we can easily calculate them from tile coordinates and back. We will use them here to encode file names for the local tile cache, as database keys, and as the base for an algebra that will ease plenty of computations.
We can encode quadkeys using either strings or sixty-four-bit integers. The string encoding uses one character, either zero, one, two, or three, per zoom level and has the advantage of keeping the zoom level as the string’s length. Its most significant disadvantage is the storage size required to store each quadkey.
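The string encoding can be sketched as below. At each zoom level, one bit of x and one bit of y combine into a single base-four digit. The function names are hypothetical helpers written for this illustration.

```python
from typing import Tuple


def tile_to_quadkey(x: int, y: int, zoom: int) -> str:
    """Encode tile coordinates as a quadkey string, one digit per level."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if x & mask:
            digit += 1  # the x bit contributes the low bit of the digit
        if y & mask:
            digit += 2  # the y bit contributes the high bit of the digit
        digits.append(str(digit))
    return "".join(digits)


def quadkey_to_tile(quadkey: str) -> Tuple[int, int, int]:
    """Decode a quadkey string back into (x, y, zoom)."""
    x = y = 0
    for char in quadkey:
        digit = int(char)
        x = (x << 1) | (digit & 1)
        y = (y << 1) | (digit >> 1)
    return x, y, len(quadkey)


print(tile_to_quadkey(7, 7, 3))
print(quadkey_to_tile("213"))
```

Note how the zoom level never needs to be stored separately: it is simply the length of the string.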
Fortunately, it is easy to compact the string encoding into something far more manageable, a sixty-four-bit integer. The conversion is relatively easy to do once we realize that the string representation of a quadkey is nothing more than a base-four number. The most straightforward conversion encodes the string into an integer but loses the zoom level information. We must somehow, implicitly or explicitly, store the zoom level information somewhere else. Alternatively, we can use the whole sixty-four bits to encode both the key and the zoom level information, but this solution limits us to twenty-three levels only. In this article, I use the former encoding as the zoom level is always available from context, and it supports encoding the required twenty-six zoom levels.
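As a minimal sketch of the straightforward conversion, the string can be parsed directly as a base-four number. The helper names are hypothetical, and the zoom level must be supplied from context when converting back, precisely because the integer does not carry it.

```python
def quadkey_str_to_int(quadkey: str) -> int:
    """Read the quadkey string as a base-four number."""
    return int(quadkey, 4) if quadkey else 0


def quadkey_int_to_str(key: int, zoom: int) -> str:
    """Rebuild the string; the zoom level must come from context."""
    digits = []
    for _ in range(zoom):
        digits.append(str(key & 3))  # the lowest two bits form one digit
        key >>= 2
    return "".join(reversed(digits))


# Leading zeros vanish in the integer form, so the zoom level is lost.
print(quadkey_str_to_int("01"), quadkey_str_to_int("001"))
print(quadkey_int_to_str(1, 2), quadkey_int_to_str(1, 3))
```

A key at zoom level 26 needs 52 bits, which fits comfortably in a sixty-four-bit integer.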
Using the image above as an example, the string encoding of the bottom-right corner tile, “333,” corresponds to the integer 63 (3 × 16 + 3 × 4 + 3). But there’s more to quadkey encodings than meets the eye. Using either the string or the integer encoding, we can immediately derive the key of the enclosing tile one zoom level up. For the string encoding, we remove the rightmost character, while for the integer encoding, we perform an integer division by 4. This property has an exciting implication when handling the 256x256 bitmaps that represent tiles: each tile pixel is another tile eight zoom levels down, which will work wonders for us.
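The parent derivation is short enough to sketch in a few lines. The function names are hypothetical; the only assumption is the base-four integer encoding described above, where each zoom level occupies two bits.

```python
def parent_str(quadkey: str) -> str:
    """Enclosing tile one level up: drop the rightmost digit."""
    return quadkey[:-1]


def ancestor_int(key: int, levels: int = 1) -> int:
    """Enclosing tile `levels` up: divide by 4 once per level.

    Shifting right by two bits is the same as an integer division by 4.
    """
    return key >> (2 * levels)


print(parent_str("333"), ancestor_int(63))

# A pixel is a tile eight levels down, so eight more digits address it.
pixel_key = int("333" + "01230123", 4)
print(ancestor_int(pixel_key, 8))
```

Going up eight levels at once means dividing by 4⁸ = 65,536, which is how a pixel key leads back to the tile that contains it.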
Map Overlays
Interactive maps realize their usefulness with overlaid geographic information. Interactive map software usually allows the overlaying of two different types of data: vectors and bitmaps. Here, we focus on a specific kind of bitmap data, tile overlays. A tile overlay works in the same way as the base map tiles do, providing a geographic content tile for each base map tile.
Each overlay tile is a square bitmap with the same dimensions as the map tiles and uses alpha compositing to reveal the underlying map information. This way, the overlay creator draws only the data to display, in the correct geographical location, without caring about how the underlying map is rendered. This is the approach we follow in this article.
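The map software performs the compositing for us, but a small sketch helps show what happens to each pixel. It assumes an opaque RGB base tile and an RGBA overlay held in NumPy arrays, and applies the standard "over" operator. The function name `composite_over` is hypothetical.

```python
import numpy as np


def composite_over(overlay_rgba: np.ndarray, base_rgb: np.ndarray) -> np.ndarray:
    """Blend an RGBA overlay onto an opaque RGB base tile."""
    # Alpha as a fraction in [0, 1], kept as a trailing axis for broadcasting.
    alpha = overlay_rgba[..., 3:4].astype(np.float64) / 255.0
    blended = alpha * overlay_rgba[..., :3] + (1.0 - alpha) * base_rgb
    return np.rint(blended).astype(np.uint8)


base = np.full((2, 2, 3), 200, dtype=np.uint8)   # light gray "map"
overlay = np.zeros((2, 2, 4), dtype=np.uint8)    # fully transparent
overlay[0, 0] = (0, 255, 0, 255)                 # one opaque green pixel
print(composite_over(overlay, base))
```

Transparent overlay pixels leave the map untouched, which is why the overlay only needs to contain the data points themselves.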
To illustrate this technique, let’s pick a tile from the OpenStreetMap server that matches our area of study, the city of Ann Arbor in Michigan, USA.
Using the base tile’s coordinates, we can now pull the corresponding geographic data and generate the overlay tile. In this case, we calculate a bivariate normal distribution for each sampled location and add it (literally) to the map. Areas with higher density will show up in a greener shade.
By composing the two images above using alpha compositing, we get the resulting bitmap (see below). Please note that the map software automatically handles this process.
Each dot you see in the overlay tile is actually a circle generated using a bivariate normal distribution. The distribution’s mean point is the location, and we consider a diagonal covariance matrix with a one-pixel standard deviation. Thus, a location expands to the following intensity matrix. The final image is the sum of all locations added together.
To avoid the inherent infinite size of a bivariate normal distribution, I decided to cut the representation when each cell’s value drops below 0.00001. Empty cells reflect such a situation.
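A sketch of this intensity matrix follows, under stated assumptions: it evaluates the normalized density of a bivariate normal with a one-pixel standard deviation at integer pixel offsets and zeroes every cell below the 0.00001 cutoff. Whether the original code normalizes the density in exactly this way is an assumption, and both function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(sigma: float = 1.0, cutoff: float = 1e-5, radius: int = 6) -> np.ndarray:
    """Density of a bivariate normal sampled on a pixel grid, truncated."""
    offsets = np.arange(-radius, radius + 1)
    dx, dy = np.meshgrid(offsets, offsets)
    density = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma ** 2))
    density /= 2.0 * np.pi * sigma ** 2
    density[density < cutoff] = 0.0  # the "empty cells"
    return density


def add_location(canvas: np.ndarray, px: int, py: int, kernel: np.ndarray) -> None:
    """Add the kernel centered on pixel (px, py), clipping at the borders."""
    r = kernel.shape[0] // 2
    height, width = canvas.shape
    y0, y1 = max(py - r, 0), min(py + r + 1, height)
    x0, x1 = max(px - r, 0), min(px + r + 1, width)
    canvas[y0:y1, x0:x1] += kernel[y0 - (py - r):y1 - (py - r),
                                   x0 - (px - r):x1 - (px - r)]


kernel = gaussian_kernel()
canvas = np.zeros((256, 256))
add_location(canvas, 100, 100, kernel)
add_location(canvas, 101, 100, kernel)  # overlapping locations add up
print(canvas[100, 98:104])
```

With a one-pixel standard deviation, the cutoff leaves nonzero values only within about four pixels of the center, so each location touches a small neighborhood.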
Tile Generation
Tile generation is a three-step process. The first step consists of data collection and transformation into a format suitable for fast query and retrieval. In the second step, the server draws the tile bitmap using the prepared data and stores it in a file cache for reuse. This article illustrates a lazy version of the second step, where the server software generates and caches the tiles on demand. Should the tile already be present in the cache, the server immediately delivers it to the client. During this process, the server code can also compare the cached tile’s generation date with the current date to determine whether it needs refreshing, which keeps the tile data up to date. The final step is the tile delivery to the client. Here, I illustrate this process using a simple Flask-based API.
Before we start exploring the three-step tile generation process, we must know the end product. As you have seen above, the goal is to display traffic density information over a set of roads, so I convert each location to a bivariate normal distribution and add them all up to produce a color-coded density map.
Tile Data Collection
Our tiles’ source data is a long sequence of geospatial locations encoded as latitude and longitude pairs. For this article, I use the Vehicle Energy Dataset data that I have been exploring for some time.
As I previously stated, interactive map software requests the tiles one at a time and pastes them together to create the final map or overlay. The serving software needs to be fast when retrieving the tile’s data to improve the user experience. Depending on the zoom level, this may be a lot of information to collect, as lower zoom level tiles contain more information. My approach to solving this challenge is to pre-calculate all tiles and store the data for each supported zoom level. Quadkeys are a lifesaver here.
As we have seen before, each tile is uniquely addressable by a quadkey code. Each tile consists of a 256x256-pixel bitmap, which means we can address each pixel as another quadkey code eight zoom levels deeper (256 = 2⁸). This insight allows the efficient encoding of tile data for later retrieval during the generation phase.
My solution to this problem was to use an SQLite database to store all the tile data with per-pixel aggregation, using one table per zoom level. Each table’s structure is simple, with just three columns. Below I show the SQL table creation script for the deepest zoom level, level 26. Note that by aggregating the geographic data at zoom level 26, we can draw tiles up to level 18 only, because each tile pixel sits eight levels below its tile (26 − 8 = 18).
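A minimal sketch of such a three-column table could look like the following. The table and column names (`l26`, `qk`, `tile`, `intensity`) are hypothetical stand-ins chosen for this illustration, and the inserted values are illustrative only.

```python
import sqlite3


def create_level_table(conn: sqlite3.Connection, level: int) -> str:
    """Create the per-pixel table for one zoom level and return its name."""
    table = f"l{int(level)}"  # hypothetical naming scheme
    conn.execute(
        f"""
        CREATE TABLE IF NOT EXISTS {table} (
            qk        INTEGER PRIMARY KEY,  -- pixel quadkey
            tile      INTEGER NOT NULL,     -- enclosing tile quadkey
            intensity REAL    NOT NULL      -- aggregated density value
        )
        """
    )
    # Non-unique index so that all pixels of one tile are found quickly.
    conn.execute(f"CREATE INDEX IF NOT EXISTS idx_{table}_tile ON {table} (tile)")
    return table


conn = sqlite3.connect(":memory:")
table = create_level_table(conn, 26)
pixel_key = (63 << 16) | 5  # illustrative pixel inside tile 63
conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)",
             (pixel_key, pixel_key >> 16, 0.5))
rows = conn.execute(f"SELECT qk, intensity FROM {table} WHERE tile = ?",
                    (63,)).fetchall()
print(rows)
```

SQLite stores integers in up to sixty-four bits, so the quadkey codes fit without any special handling.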
The first column contains the sixty-four-bit encoded quadkey code for a single pixel at zoom level 26. As such, we can very quickly convert this value to the pixel coordinates within the tile and draw the pixel according to the calculated intensity value, the table’s third column.
The second column encodes the enclosing tile, also as a sixty-four-bit quadkey code. We calculate it as the pixel quadkey code integer-divided by 65,536 (4⁸, or 2¹⁶), which moves the key eight zoom levels up, and create a non-unique index on this column to make the tile data retrieval very fast.
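The relationship between a pixel key, its enclosing tile, and its position within that tile can be sketched as follows. It assumes the integer encoding where each zoom level contributes two bits, with the x bit in the lower position, and both function names are hypothetical.

```python
from typing import Tuple


def interleave(x: int, y: int, zoom: int) -> int:
    """Build an integer quadkey from tile (or global pixel) coordinates."""
    key = 0
    for bit in range(zoom):
        key |= ((x >> bit) & 1) << (2 * bit)
        key |= ((y >> bit) & 1) << (2 * bit + 1)
    return key


def split_pixel_key(pixel_key: int) -> Tuple[int, int, int]:
    """Return (tile_key, px, py) for a pixel quadkey.

    The tile sits eight levels up, so its key is the pixel key shifted
    right by 16 bits. The low 16 bits hold the position inside the tile.
    """
    tile_key = pixel_key >> 16
    local = pixel_key & 0xFFFF
    px = py = 0
    for bit in range(8):
        px |= ((local >> (2 * bit)) & 1) << bit
        py |= ((local >> (2 * bit + 1)) & 1) << bit
    return tile_key, px, py


# Pixel (10, 200) of tile (7, 7) at zoom level 3 lives at zoom level 11.
key = interleave(7 * 256 + 10, 7 * 256 + 200, 11)
print(split_pixel_key(key))
```

The returned pixel coordinates are relative to the tile, with (0, 0) at its top-left corner, which is the form needed for drawing.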
When the whole level 26 computation is complete, we can immediately derive level 25 through a simple aggregation, as illustrated by the SQL script below.
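One way to express this aggregation is sketched below. It reuses the hypothetical table layout with columns `qk`, `tile`, and `intensity`, and it assumes that the aggregation is a sum of the four child intensities, which the text does not state explicitly.

```python
import sqlite3


def create_level_table(conn: sqlite3.Connection, level: int) -> str:
    """Create the hypothetical per-level table and return its name."""
    table = f"l{int(level)}"
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "qk INTEGER PRIMARY KEY, tile INTEGER NOT NULL, intensity REAL NOT NULL)"
    )
    conn.execute(f"CREATE INDEX IF NOT EXISTS idx_{table}_tile ON {table} (tile)")
    return table


def aggregate_up(conn: sqlite3.Connection, level: int) -> str:
    """Derive the table for `level - 1` from the table for `level`."""
    src = f"l{int(level)}"
    dst = create_level_table(conn, level - 1)
    # Shifting right by two bits is an integer division by 4, so the four
    # children of a pixel collapse onto their parent's key.
    conn.execute(
        f"""
        INSERT INTO {dst} (qk, tile, intensity)
        SELECT qk >> 2, MIN(tile) >> 2, SUM(intensity)
        FROM {src}
        GROUP BY qk >> 2
        """
    )
    return dst


conn = sqlite3.connect(":memory:")
create_level_table(conn, 26)
base = 63 << 16  # illustrative keys only
conn.executemany(
    "INSERT INTO l26 VALUES (?, ?, ?)",
    [(base + i, 63, 1.0) for i in range(4)] + [(base + 4, 63, 2.5)],
)
aggregate_up(conn, 26)
print(conn.execute("SELECT qk, tile, intensity FROM l25 ORDER BY qk").fetchall())
```

Calling `aggregate_up` repeatedly, each time one level higher, would populate the remaining tables.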
To calculate all zoom levels, we repeat this process up to the topmost level, eight in our case.
We can now perform the final step of data preparation, namely calculating the intensity range for each zoom level. This information is essential for the coloring process while drawing the tile bitmap, as the value range maps to a predefined color gradient.
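The range calculation can be sketched with a single query per level. The per-level table names and the `level_range` summary table are hypothetical, as is the choice of storing the results in the same database.

```python
import sqlite3
from typing import Iterable, Optional, Tuple


def level_range(conn: sqlite3.Connection, level: int) -> Tuple[Optional[float], Optional[float]]:
    """Return (min, max) intensity for one level, or (None, None) if empty."""
    query = f"SELECT MIN(intensity), MAX(intensity) FROM l{int(level)}"
    return conn.execute(query).fetchone()


def store_ranges(conn: sqlite3.Connection, levels: Iterable[int]) -> None:
    """Persist the range of every level for use while coloring tiles."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS level_range ("
        "level INTEGER PRIMARY KEY, min_intensity REAL, max_intensity REAL)"
    )
    for level in levels:
        low, high = level_range(conn, level)
        conn.execute("INSERT OR REPLACE INTO level_range VALUES (?, ?, ?)",
                     (level, low, high))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE l26 (qk INTEGER PRIMARY KEY, tile INTEGER, intensity REAL)")
conn.executemany("INSERT INTO l26 VALUES (?, ?, ?)",
                 [(1, 0, 0.25), (2, 0, 3.5), (3, 0, 1.0)])
store_ranges(conn, [26])
print(conn.execute("SELECT * FROM level_range").fetchall())
```

Computing the ranges once, ahead of time, keeps this cost out of the tile-serving path.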
Now that the data is fully prepared, we can proceed to the tile generation and serving process description.
Tile Generation and Serving
For this article, I devised a straightforward Flask-based API to serve tile files. The API endpoint receives as parameters the tile coordinates and returns the corresponding tile PNG file. Here I am using a generic function to do all the heavy lifting.
As parameters, the function accepts the tile coordinates, the path to the SQLite database containing the tile data, and the file cache folder's path. The procedure starts by limiting the zoom level to the accepted range, between one and eighteen. Beyond these limits, it merely returns the default empty (fully transparent) tile.
For proper zoom levels, the function computes the target tile filename using the tile quadkey code, and if this file already exists in the cache, serves it immediately. For nonexistent files, the function must render the tile and save it before serving.
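The control flow described so far can be sketched as a small function. This is not the article's actual code: the function names, the cache folder layout, and the `render` callback (which is assumed to return PNG bytes, or `None` for an empty tile) are all hypothetical.

```python
import os
from typing import Callable, Optional


def quadkey_int(x: int, y: int, zoom: int) -> int:
    """Integer quadkey of a tile, two bits per zoom level."""
    key = 0
    for bit in range(zoom):
        key |= ((x >> bit) & 1) << (2 * bit)
        key |= ((y >> bit) & 1) << (2 * bit + 1)
    return key


def cached_tile_path(x: int, y: int, zoom: int, cache_dir: str,
                     render: Callable[[int, int, int], Optional[bytes]],
                     default_tile: str = "default.png") -> str:
    """Return the path of the PNG file to serve for a tile."""
    if not 1 <= zoom <= 18:
        return default_tile  # outside the supported range
    # One folder per zoom level, because the integer key omits the zoom.
    path = os.path.join(cache_dir, str(zoom), f"{quadkey_int(x, y, zoom)}.png")
    if os.path.exists(path):
        return path  # cache hit: serve immediately
    png_bytes = render(x, y, zoom)
    if png_bytes is None:
        return default_tile  # no data: serve the transparent tile
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(png_bytes)
    return path
```

A refresh policy could be added to the cache-hit branch by comparing the file's modification time, from `os.path.getmtime`, against a maximum age.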
The process of generating a new tile starts by establishing a connection to the database containing the zoom level data. It then queries the database for all the tile’s pixel intensities. Should the tile be empty, the function serves the default transparent tile.
The code represents tiles with data as lists of tuples, each holding an individual pixel’s coordinates, as a quadkey code, and its intensity. These lists must then be converted into tile-based pixel coordinates, where each tile’s top-left corner has the (0, 0) coordinate. Then, the function collects the intensity range information for the zoom level at hand. With all this information, we can now paint the tile.
Painting tiles with NumPy and PyPNG
While researching how to create PNG files from Python code, I came across an elegant package: PyPNG. This package can convert NumPy arrays into PNG files, which seems like a great idea. Here is how you create a NumPy array that represents a 256x256 RGBA image, encodable as PNG:
Painting a tile is a simple matter of setting the individual pixel values to the appropriate color. The next function uses the list of pixels, the color gradient, and the suitable zoom range to paint a tile.
Each color value from the gradient list is a four-element NumPy vector, with one element for each channel component, so setting a pixel is a simple assignment. The function that generates the gradient list also sets the alpha channel value to 50%.
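The painting step can be sketched as follows. The function names, the two-color linear gradient, and the linear mapping from intensity to gradient index are assumptions made for this illustration; the original code may use a different gradient or scale.

```python
import numpy as np


def make_gradient(start_rgb, end_rgb, steps: int = 256, alpha: int = 128) -> np.ndarray:
    """Linear RGBA gradient; alpha 128 is roughly 50% opacity."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    rgb = (1.0 - t) * np.array(start_rgb) + t * np.array(end_rgb)
    gradient = np.empty((steps, 4), dtype=np.uint8)
    gradient[:, :3] = np.rint(rgb).astype(np.uint8)
    gradient[:, 3] = alpha
    return gradient


def paint_tile(pixels, gradient: np.ndarray, min_val: float, max_val: float) -> np.ndarray:
    """Paint (px, py, intensity) tuples onto a transparent 256x256 tile."""
    tile = np.zeros((256, 256, 4), dtype=np.uint8)  # alpha 0 everywhere
    span = max(max_val - min_val, 1e-12)            # avoid division by zero
    last = len(gradient) - 1
    for px, py, intensity in pixels:
        t = min(max((intensity - min_val) / span, 0.0), 1.0)
        tile[py, px] = gradient[int(round(t * last))]  # rows are y, columns x
    return tile


gradient = make_gradient((0, 0, 0), (0, 255, 0))
tile = paint_tile([(10, 20, 5.0), (11, 20, 0.0)], gradient, 0.0, 5.0)
print(tile[20, 10], tile[20, 11], tile[0, 0])
```

Pixels that receive no data keep an alpha of zero, so they stay fully transparent when the map software composites the overlay.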
Saving the tile to a PNG formatted file is straightforward with the PyPNG package. The code below illustrates the process. Note the required array reshape before saving.
Finally, the API can serve the tile file by creating a response object around it.
Using the Code
To use the code, start by cloning the GitHub repository to your local machine. The first step is to execute the first two numbered Jupyter notebooks to create the supporting database. This will read the data from the distribution dataset and import it into a local SQLite database.
Next, you must create and populate the supporting SQLite tile database. You do so by running the following script:
python generate_densities.py
Please note that this script may take a very long time to run on the VED data. Expect more than one hour of total runtime. Once finished, you can start the tile API using the following script:
python tileapi.py
This command starts a Flask server listening on port 2310. To see the tile server in action, please run Jupyter notebook number ten. You will see a map centered on Ann Arbor, Michigan. If all went well, you should start seeing the tiles being rendered over the map. As you pan and zoom, the tile server will generate, cache, and serve the appropriate tiles.
Conclusion
In this article, we have explored the concept of interactive map tiles and how to generate them to convey custom geographic information dynamically. Interactive mapping software uses square bitmap tiles to build the whole map. To make these maps useful, we can overlay vector or raster images to convey geographically referenced information. Overlaid map tiles are a convenient and fast way to display such information but usually require some lengthy preprocessing. This article walked you through the steps to build a custom solution that you can further adapt.
João Paulo Figueira works as a Data Scientist at tb.lx by Daimler Trucks and Buses in Lisbon, Portugal