
Creating an interactive map of wildfire data using Folium in Python

How to plot points and polygons in Folium – an example using two wildfire datasets

Photo by Ross Stone on Unsplash

Introduction

I’ve been very interested in wildfires recently, as 2020 was a record year for them here in Colorado. Loosely continuing the theme of my first article, I wanted to do a project on wildfires in the US. During my search for data, I found two awesome geospatial datasets to work with: a [Kaggle](http://kaggle.com) dataset of points, stored in a SQLite database, containing records for 1.88 million wildfires from 1992 to 2015[1], and a United States Geological Survey (USGS) dataset of wildfire polygons[2] from 1878–2019, stored in a shapefile. In general, Kaggle and US government agencies are excellent data sources for projects. In this article, I’ll show you how to make an interactive map of this data using open source software (Folium) in Python.

For this project, you’ll need GeoPandas, Folium and Branca (you also need Pandas and NumPy, but I assume you have those installed already 😃). If you do not have these libraries, pip install them from a command prompt or a Jupyter Notebook cell:

pip install geopandas 
pip install folium
pip install branca

Part 1: Plotting points in Folium (Kaggle Dataset)

Here, I’ll run through how to plot points in Folium from the Kaggle dataset. As the data is in a SQLite database, my workflow will be to read the table of interest from the database into a Pandas dataframe, clean the data up a bit, and then plot the data in Folium for visual analysis.

First, import the necessary libraries, and establish a connection to the sqlite database using sqlite3:
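A minimal sketch of that first step. The file name and the helper function are assumptions made for illustration (use whatever your Kaggle download is called), and the demo line opens an in-memory database so the snippet runs even without the download:

```python
import sqlite3

import pandas as pd  # used in the following steps to load query results

# Assumption: placeholder name for the downloaded Kaggle file; adjust to your path.
DB_PATH = "wildfires.sqlite"


def open_fires_db(path):
    """Return a sqlite3 connection to the database file at `path`."""
    return sqlite3.connect(path)


# ":memory:" creates a throwaway in-memory database, used here only so the
# sketch runs without the real file. In the project: conn = open_fires_db(DB_PATH)
conn = open_fires_db(":memory:")
```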

Then, use the Pandas function ‘read_sql_query’ with a simple SQL statement to select all rows from the ‘Fires’ table. To find out which table you need for an analysis, I recommend browsing the database in DB Browser for SQLite before making this function call.
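Here is a rough sketch of that call. The tiny in-memory ‘Fires’ table and its rows are made up so the example runs on its own; the query against ‘sqlite_master’ is a scripted alternative to opening the file in a GUI browser:

```python
import sqlite3

import pandas as pd

# Stand-in database: a tiny in-memory 'Fires' table with made-up rows,
# so the sketch runs without the Kaggle file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Fires (FIRE_NAME TEXT, FIRE_YEAR INTEGER, FIRE_SIZE REAL)"
)
conn.executemany(
    "INSERT INTO Fires VALUES (?, ?, ?)",
    [("EXAMPLE_A", 2005, 0.5), ("EXAMPLE_B", 2011, 1500.0)],
)

# List the tables stored in the database.
tables = pd.read_sql_query(
    "SELECT name FROM sqlite_master WHERE type = 'table'", conn
)
print(tables)

# Read every row of the 'Fires' table into a dataframe.
df = pd.read_sql_query("SELECT * FROM Fires", conn)
print(df.shape)
```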

Next, I create a new dataframe with only the columns I’m interested in. Note that alternatively, you could have done this during the last step with a longer SQL statement:

#in lieu of dataframe manipulation in the next code block:
df = pd.read_sql_query("""SELECT FIRE_NAME, FIRE_YEAR, 
                       SOURCE_REPORTING_UNIT_NAME,
                       STAT_CAUSE_DESCR, FIRE_SIZE, 
                       LATITUDE, LONGITUDE, STATE
                       FROM Fires""", conn)
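The dataframe-side version of the same selection might look something like this. The sample rows and the ‘EXTRA_COLUMN’ name are invented for illustration; only the eight column names come from the SQL statement above:

```python
import pandas as pd

# Made-up stand-in for the full 'Fires' dataframe (the real table is much wider).
df_all = pd.DataFrame({
    "EXTRA_COLUMN": [1, 2],  # hypothetical column we do not need
    "FIRE_NAME": ["EXAMPLE_A", "EXAMPLE_B"],
    "FIRE_YEAR": [2005, 2011],
    "SOURCE_REPORTING_UNIT_NAME": ["Unit A", "Unit B"],
    "STAT_CAUSE_DESCR": ["Lightning", "Campfire"],
    "FIRE_SIZE": [0.5, 1500.0],
    "LATITUDE": [39.5, 30.7],
    "LONGITUDE": [-105.8, -82.3],
    "STATE": ["CO", "GA"],
})

columns_of_interest = [
    "FIRE_NAME", "FIRE_YEAR", "SOURCE_REPORTING_UNIT_NAME",
    "STAT_CAUSE_DESCR", "FIRE_SIZE", "LATITUDE", "LONGITUDE", "STATE",
]

# .copy() gives an independent dataframe, which avoids pandas'
# SettingWithCopyWarning if the subset is modified later.
df = df_all[columns_of_interest].copy()
print(df.columns.tolist())
```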

df.shape tells us that this dataset has 1,880,465 rows – displaying this much data on a map will surely be way too much information. So, I decided to create a new dataframe that only contains fires greater than 1000 acres:
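A boolean mask is one straightforward way to do this filtering. The sketch below uses four made-up rows; ‘df_large’ is just an illustrative variable name:

```python
import pandas as pd

# Made-up fire sizes in acres, standing in for the full dataframe.
df = pd.DataFrame({
    "FIRE_NAME": ["EXAMPLE_A", "EXAMPLE_B", "EXAMPLE_C", "EXAMPLE_D"],
    "FIRE_SIZE": [0.5, 1000.0, 1500.0, 309000.0],
})

# Keep only fires strictly larger than 1,000 acres and renumber the rows.
df_large = df[df["FIRE_SIZE"] > 1000].reset_index(drop=True)
print(len(df), "->", len(df_large))
```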

This new dataset is only a fraction of the size of the original 1.88 million, containing 11,087 wildfires. Now, we’re ready to make the map! Note that ‘map’ is the name of a Python built-in function, so it’s common practice to store the Folium map in a variable ‘m’ and evaluate ‘m’ to display the map when you’re ready. When dealing with point data, create a ‘feature group’ variable for each set of points that should have its own styling. Each feature group becomes a layer that you can turn on/off independently of the other layers. I then iterate over each row in the dataframe and use if statements to populate each of those feature groups.

Here’s what the finished map will look like. You can pan around the map and hover over the markers to view the wildfires. I used the natural logarithm of fire acreage for the marker size (radius) keyword argument, which makes smaller fires appear as smaller circles and larger fires appear as larger circles. Smaller fires are colored in yellow, while bigger fires are colored in red:
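To make the log scaling concrete, here is a small sketch. The helper name and the 10,000-acre color threshold are assumptions for illustration; only the idea of using the natural log of acreage as the radius comes from the text:

```python
import math


def marker_style(acres, red_threshold=10_000):
    """Return CircleMarker keyword arguments for a fire of `acres` acres."""
    return {
        # The natural log compresses the huge range of fire sizes.
        "radius": math.log(acres),
        # Assumed cutoff: fires at or above the threshold are drawn in red.
        "color": "red" if acres >= red_threshold else "yellow",
    }


for acres in (1_500, 309_000):
    style = marker_style(acres)
    print(acres, round(style["radius"], 1), style["color"])
```

A 1,500 acre fire gets a radius of about 7.3 and a 309,000 acre fire about 12.6, so a fire roughly 200 times larger is drawn less than twice as wide, which keeps the biggest fires from swamping the map.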

I expected to see many fires in the western US. However, one thing that surprised me in this dataset was the number of wildfires that have occurred in Florida and southern Georgia, including the ~309,000 acre Honey Prairie fire in 2011, started by lightning (Fig. 1).

Fires in Florida [1]. Image by author
Fires in Colorado [1]. Image by author

Part 2: Plotting polygons in Folium (USGS Dataset)

The USGS data comes in a shapefile. We can use GeoPandas to read in the shapefile:

For this analysis, I’m only plotting wildfires greater than 100,000 acres. I do this because all 65,845 polygons would generate far too large a file. Even with only wildfires > 100,000 acres (280 polygons), the final html file is still going to be large. The proper way to deal with this much data would probably be to have the web map pull the data directly from a SQL database. Perhaps I will write about that in a future blog post 😃.

Next, let’s import Branca’s colormap module to generate a linear colormap for our polygons. I scaled the colormap from the size of the smallest fire to the 75th percentile of fire size (‘Acres’). This way, the largest 25% of wildfires will all be colored red, while most of the fires (smallest 75%) will exhibit a wide range of colors. I also defined a quick function to reverse the colormap. To view all available colormap options, inspect ‘cm.linear’.

Now we’re ready to plot the polygons. For the colormap, you’ll need to make a dictionary with ‘FireName’ as the keys and ‘Acres’ as the values. I use the dictionary values as an argument in the lambda function stored in the variable ‘map_colors’, and then pass ‘map_colors’ as the ‘style_function’ keyword argument of Folium’s GeoJson class.

And, we now have polygons!

Fires in Arizona [2]. Image by author
Fires in Northern California [2]. Image by author

You can view the full interactive Kaggle map here, the full interactive USGS map here, or view/download both the maps and notebooks on my GitHub repository. Note that the USGS map may load slowly in the link above on palkovic.org.

Cheers! Happy mapping,

Martin


References

[1] Tatman, Rachael, 1.88 Million U.S. Wildfires. Kaggle, 2020, https://www.kaggle.com/rtatman/188-million-us-wildfires

[2] Welty, J.L., Jeffries, M.I., 2020, Combined wildfire datasets for the United States and certain territories, 1878–2019: U.S. Geological Survey Data Release, https://doi.org/10.5066/P9Z2VVRT

