In this second part of a two-part series on fire data from Toronto between 2011 and 2016, I analyze the severity of fires addressed by the Toronto Fire Services (TFS). In the first part of the series, general properties of the collection of fire incidents were analyzed, in particular with respect to when they occur, but with little consideration of the nature of each fire.
Accidental fires are a great example of a phenomenon where extreme events matter a great deal more than typical events. A fire department that can handle 90% of all fires well, but which cannot deal with the most severe 10%, is not fit for purpose. As with earthquakes, floods, epidemics, insurance claims, and certain kinds of financial planning, extreme events, though rare, must be considered in any competent design. These highly consequential but elusive events go by a host of names: extremal events, fat tails, the Paretian world, Black Swans. In order to keep the concepts simple and as intuitive as possible, I refer to them as extreme events.
Extreme events pose a fundamental challenge to a purely data-driven approach to analysis. By definition these are rare events, and therefore in any finite data collection they are poorly sampled. In machine learning, the problem of skewed or imbalanced classes is one manifestation of this. Though there are ways to navigate the pitfalls of dealing with such data, there are fundamental limits to what data can reveal about poorly sampled regions without a theory of the mechanics of the system to help fill in the gaps.
There are still useful ways to quantify the degree of extremeness and to acquire knowledge of the system under study. This is a subject of active research, and the approach below is an introduction to some of what can be done.
How To Quantify Severity
In order to proceed, I need to quantify severity. Is it the area covered by the fire that determines the severity, or is it the duration of the fire, or is it the loss of property value, or is it the number of human lives at stake? The answer: yes to all.
In practice the TFS must make this determination as they deploy their limited resources. And this is what I will use as a proxy metric for severity: the number of dispatched units during the course of the fire incident. A fire deemed more severe by the institution and people tasked to address it should result in a greater number of dispatched units (fire trucks, hazardous materials trucks, air supply trucks, command vehicles etc.). The multi-objective definition of severity is thus reduced to a scalar property by the domain experts.
This is hardly a perfect proxy metric, an issue I return to at the conclusion of the text. But I think it is good enough to quantify salient properties of fire severity. Call it a starting point.
Observed Frequencies to Probability Density
In the scatter-plot below the number of incidents associated with a certain number of deployed units is shown; note the logarithmic scale on the vertical axis.

As expected, and by a sizable margin, most fires are of minor severity and only a very small fraction of incidents are very severe.
Upon closer inspection, the number of incidents is not monotonically decreasing with the number of responding units over the entire range. In fact, the two most common values are 1 and 6 responding units. This variation is explained by the rules the TFS operates by: even the smallest residential fires are addressed with six units.
This is a limitation of the severity metric with respect to resolving finer-grained differences between fires. Since the goal of this effort is to characterize extreme events, though, these finer details should not be allowed to become the proverbial trees that prevent the forest from being seen.
I create a coarse-grained variant of the severity metric by binning the number of responding units into the bins [1,6], [7,12],… In other words, the differences between fires with 1 to 6 responding units are removed from consideration: they are now considered equally severe. The scatter-plot transforms as follows:

The first bin is the sum of fire incidents that result in one to six responding units; the second bin is for incidents that result in seven to twelve responding units; and so on.
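As a brief sketch of this coarse-graining step with pandas (the file name and the column name responding_units are placeholders, not the actual TFS schema):

```python
import numpy as np
import pandas as pd

# Stand-in for the TFS incident data; in practice it would be read from the
# open data file, e.g. df = pd.read_csv("tfs_incidents.csv").
df = pd.DataFrame({"responding_units": np.random.randint(1, 40, size=1000)})

# Bin the number of responding units into [1, 6], [7, 12], [13, 18], ...
bin_edges = np.arange(0, df["responding_units"].max() + 6, 6)
df["severity_bin"] = pd.cut(df["responding_units"], bins=bin_edges, labels=False) + 1

# Number of incidents per coarse-grained severity bin.
counts = df.groupby("severity_bin").size()
print(counts)
```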
Log-Log and the Pareto Distribution
Note that the horizontal axis is also logarithmic in the plot above; this is therefore a log-log plot. The dots line up quite well, barring two or three extreme values. A least-squares line fit to the dots (excluding for now the ones at bins 33 and 76) yields:
log N_annual ≈ log C − 4 · log S
In this equation S denotes the bin number, which is the coarse-grained severity scale, N_annual the annual number of events of severity S, and C a constant set by the fit. The equation is reorganized:
N_annual ≈ C / S⁴
To make qualitative sense of this equation, consider what happens if the severity is doubled, that is S’ = 2S. Since 2⁴ = 16, doubling the severity implies a reduction in the number of occurrences by a factor of sixteen. Differently put, there are sixteen times fewer fire incidents of double the severity. Furthermore, this statement is scale invariant: no matter what the value of S is, double it and the frequency of incidents is reduced by a factor of sixteen.
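A minimal sketch of such a fit with NumPy, assuming the per-bin counts are available as a pandas Series like the counts from the coarse-graining sketch above (the values below are placeholders, not the actual TFS numbers):

```python
import numpy as np
import pandas as pd

# Incidents per severity bin over 2011-2016; placeholder values for illustration.
counts = pd.Series({1: 9000, 2: 600, 3: 80, 4: 20, 5: 6, 6: 2})

bins = counts.index.to_numpy(dtype=float)
n_annual = counts.to_numpy(dtype=float) / 6.0        # six years of data

# Exclude the outlying extreme bins mentioned in the text (33 and 76), if present.
mask = ~np.isin(bins, [33, 76]) & (n_annual > 0)

# Least-squares fit of log N_annual = log C - a * log S on the log-log scale.
slope, intercept = np.polyfit(np.log10(bins[mask]), np.log10(n_annual[mask]), deg=1)
print(f"fitted exponent a ~ {-slope:.1f}, log C ~ {intercept:.1f}")
```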
A frequency distribution of this form is known as a Pareto distribution. It is a class of distributions with interesting properties, one that has at times been canonized as the true way to understand the modern world, from investing to love. I will stay away from the general claims, and as far as this study is concerned simply note that, to a first approximation, the severity of fires in Toronto is Pareto distributed with shape parameter equal to 4.
Put A Number on Extremeness of Randomness
A feature of a Pareto distribution is that moments of order above some threshold γ are infinite. For example, the integral that evaluates the third moment of the above distribution is not finite; the first and second moments, on the other hand, are. The French-American mathematician Benoit Mandelbrot characterized the state of randomness of a system by the value of γ. The lower the threshold, the more extreme the extreme random events are relative to the typical random events. Mandelbrot defined the randomness of distributions with a threshold between 1 and 2 as wild randomness.
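To see where the threshold comes from, assume the fitted S⁻⁴ form holds out to arbitrarily large severities. The p-th moment is then proportional to ∫ s^p · s^(−4) ds = ∫ s^(p−4) ds, taken up to infinity, and this integral converges only when p − 4 < −1, that is when p < 3. The first and second moments are finite, while the third and higher moments diverge.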
The threshold at which moments become infinite can thus be used to characterize how extreme the extreme events are relative to the typical events. This property does not require a Pareto distribution. An estimation of the threshold γ that is based on more relaxed assumptions about the underlying distribution is therefore helpful.
One approach uses a property of the following ratio, computed from sampled data X.
R_n(p) = max(X_1^p, …, X_n^p) / (X_1^p + … + X_n^p)
It can be shown that in the limit of infinite n, this ratio tends to zero if and only if the p-th moment is finite. Conversely, the value of p at which this ratio stops tending to zero is the critical threshold.
I use the number of responding units (not the coarse-grained metric this time) from the TFS data set from 2011–2016 and compute the ratio as progressively more data points are included.

The estimation of asymptotic limits from a finite sample is risky business – after all we never reach the end. However, the diagram strongly suggests that both the first and second moments are finite. Somewhere in the range p=2.5 to p=3.5, the ratio seems to plateau at a value above zero.
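A sketch of how this running ratio can be computed, assuming the responding-unit counts are available as a NumPy array (the synthetic data below is only a stand-in for the TFS values):

```python
import numpy as np

def max_to_sum_ratio(x, p):
    """Running ratio max(x_i^p) / sum(x_i^p) as more data points are included."""
    xp = np.asarray(x, dtype=float) ** p
    running_max = np.maximum.accumulate(xp)
    running_sum = np.cumsum(xp)
    return running_max / running_sum

# Stand-in for the responding-unit counts per incident; in practice this would
# be the relevant column from the TFS data set.
units = np.random.pareto(3.0, size=50_000) + 1.0

for p in (1, 2, 3, 4):
    ratio = max_to_sum_ratio(units, p)
    print(f"p={p}: ratio after all {units.size} samples = {ratio[-1]:.3f}")
```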
One favourable feature of the above approach is that it is little affected by the exact nature of the distribution at low fire incident severity. In the line fit in the previous section, the points at low values of S mattered. Since the goal of this study is to characterize extreme events, rather than the entire range of fire incident severity, the deviations from a smoothly decreasing distribution at low severity, noted above, should ideally not affect the method of analysis.
Excesses Above Threshold to Isolate Extremes
An intuitive idea to further characterize the distribution of extreme events is to study exceedances. This is the distribution of events in which severity exceeds some high threshold value.
There is a wealth of theory developed around this approach. In hydrology it is referred to as the peaks over threshold approach, and it can be visualized as how far above a threshold a river flows, ignoring the exact values at times when the threshold is not exceeded. In particular, asymptotic relations with respect to this threshold can be proven.

The time-series line diagram above shows a snippet of the fire incident time-series, with a threshold at 20 responding units marked. It is the statistical features of the values exceeding the severity threshold that can be related to the features of the extreme event distribution.
As before, when estimating limiting quantities from a finite sample, this time the threshold, there is as much art as science at play. Without getting into the details, the shape of the tail distribution found with this method is in the range 2.5 to 4.5, depending mostly on whether the most extreme severity case is included in the consideration.
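For reference, a minimal sketch of a peaks-over-threshold fit with SciPy, under the assumption that the exceedances are well described by a generalized Pareto distribution (the synthetic units array is a stand-in for the TFS responding-unit counts, and the threshold of 20 follows the diagram above):

```python
import numpy as np
from scipy.stats import genpareto

# Stand-in for the responding-unit counts per incident.
units = 6.0 * (np.random.pareto(3.0, size=100_000) + 1.0)

threshold = 20
excesses = units[units > threshold] - threshold     # peaks over threshold

# Fit a generalized Pareto distribution to the excesses, location fixed at zero.
shape, loc, scale = genpareto.fit(excesses, floc=0)

# For a heavy tail (shape > 0), the implied tail exponent is roughly 1 / shape.
print(f"GPD shape xi = {shape:.2f}, implied tail exponent ~ {1.0 / shape:.1f}")
```

The choice of threshold and the treatment of the most extreme incidents move this estimate around, which is the art referred to above.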
The handful of methods employed to characterize and quantify the extreme end of the fire incident distribution points to exponents (shape parameters) in the range from the high twos to the low fours. If we adopt 3.5 as a compromise, the qualitative implication for scaling up towards extreme events is: since 2^3.5 ≈ 11, a fire incident of double the severity is roughly eleven times less frequent, above some moderate severity threshold.
What Is Measured and What Matters?
In parts of the analysis above I have omitted the two most extreme incidents on the severity scale. They do not seem to quite fit in. Am I committing the crime of forcing reality to fit my pretty equations à la Pythagoras? Perhaps. It might be that the extremeness at very high levels has been underestimated, and that the shape parameter should be lowered for the tail distribution.
The two incidents in question are a storage facility on fire, with plenty of fuel that kept the fire alive for more than a day, and a mattress factory in flames, which also lasted a very long time thanks to ample fuel to burn. The high numbers of responding units in part reflect the extended duration of these fires, since multiple units had to come and go. It is possible the severity proxy metric I adopt rates fires that take a long time to put out as exceedingly severe.
In analysis of real-world data, the property that actually matters to the question or prediction of interest can be ambiguous and therefore tough to quantify in practice. Fire severity is hardly the first such property. Discerning formulation and engineering of metrics that measure what actually matters can make all the difference. That analysis precedes any creative use of databases, algorithms or GPUs.
To conclude, the cavalcade of techniques has characterized the extreme end of the severity of Toronto fire incidents, given a natural shortage of extreme data to learn from. In order to create a predictive approach, maybe even with spatial and temporal resolution, further refinements are needed. Among these is the severity metric, which has its limitations in how well it captures the real-world aspects that matter. But we have to start somewhere to figure out what problems need solving, and extreme events are no doubt worthwhile to address.
Notes
- TFS incident raw data from the City of Toronto open data webpage.
- This book by Embrechts et al. is a helpful introduction to extremal event modelling.
- Visualizations are done with Tableau, most quantitative analysis with Python libraries Pandas and Numpy.