Each year the Toronto Fire Services (TFS) are dispatched to between 9,000 and 10,000 fires in the city of 2.7 million inhabitants. The severity ranges from minor fires in grass or rubbish to major fires in warehouses or residential high-rises. In this study, the first of a two-part series on fire data analysis from Toronto between 2011 and 2016, I analyze properties of incident frequency, regardless of severity, through data segmentation and aggregation.
The data analysis is limited to reported actual fires. Other needs addressed by the TFS, such as medical emergencies, false fire alarms, and rescues from malfunctioning elevators, are therefore excluded from the present analysis. Fires that are handled without involvement of the TFS are likewise absent from the data.
The locations of the 85 fire stations in the city of Toronto are shown below.

A first simple observation from the data is that the total number of fire incidents per year shows no meaningful trend over the six years under study. Whatever changes there are in regulations, fire safety education, economics, and so on, they are too slow to leave a visible mark on the annual incident count. All analysis will therefore treat the six years as identically distributed.
Evenings & Weekends: Time for Fire
Data analysis starts with data exploration, which implies data visualization. The heat-map below shows the number of fire incidents as a function of hour of day and day of the week. The darker the shade of green, the greater the number of incidents in the associated time segment.

The visualization shows that fire incidents take place to a greater extent in the evenings, to a lesser extent in the night and early morning, and that weekends are subtly yet noticeably different from the rest of the week.
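The table behind such a heat-map can be sketched in a few lines of pandas. This is a minimal sketch, not the original workflow (the visualizations in this study were made with Tableau); the column name `alarm_time` and the five sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: one row per fire incident with its dispatch timestamp.
incidents = pd.DataFrame({
    "alarm_time": pd.to_datetime([
        "2015-07-04 19:15",  # Saturday evening
        "2015-07-04 19:40",  # Saturday evening
        "2015-07-05 00:20",  # Sunday, just after midnight
        "2015-07-06 18:05",  # Monday evening
        "2015-07-08 04:30",  # Wednesday early morning
    ])
})

day_names = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

incidents["hour"] = incidents["alarm_time"].dt.hour
incidents["weekday"] = incidents["alarm_time"].dt.day_name()

# Rows are days of the week, columns are hours 0-23, cells are incident counts.
# Reindexing keeps the time segments in which no incident occurred as zeros.
heat = (
    pd.crosstab(incidents["weekday"], incidents["hour"])
    .reindex(index=day_names, columns=range(24), fill_value=0)
)
print(heat.loc["Saturday", 19])
```

Each cell of `heat` corresponds to one coloured square of the heat-map.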
The incident data is further aggregated, and the average number of fire incidents for a given hour of the day is shown next, separately for weekends and for Mondays to Fridays.

A few simple observations from the line chart: on average there are 1.2 dispatches to a fire in Toronto on a weekend between midnight and 1 AM; the lowest average number of incidents, just over 0.3, is between 4 AM and 5 AM on weekdays.
Taking a bigger-picture view of the line chart, two additional observations of a general nature follow:
- There are more fires during weekends for almost all hours of the day (vertical upward shift of weekend curve).
- The relative frequency of fire incidents is shifted about one hour later on weekends compared to weekdays (horizontal rightward shift of weekend curve).
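The averages in the line chart can be reproduced along the following lines. This is a sketch under assumptions: the function name `hourly_average`, its arguments, and the sample timestamps are hypothetical. The one detail that matters is the denominator, which is the number of calendar days of each type in the study period, so that hours without any incident still count towards the average.

```python
import pandas as pd

def day_type(ts):
    """Label each timestamp as 'weekend' (Saturday, Sunday) or 'weekday'."""
    return ts.dt.dayofweek.map(lambda d: "weekend" if d >= 5 else "weekday")

def hourly_average(timestamps, start, end):
    """Mean number of incidents per hour of day, for weekdays and weekends."""
    ts = pd.Series(pd.to_datetime(timestamps))
    # Incident counts per hour of day (rows) and type of day (columns).
    counts = (
        pd.crosstab(ts.dt.hour, day_type(ts))
        .reindex(index=range(24), columns=["weekday", "weekend"], fill_value=0)
    )
    # Number of calendar days of each type in the period, incidents or not.
    days = pd.Series(pd.date_range(start, end, freq="D"))
    n_days = day_type(days).value_counts().reindex(
        ["weekday", "weekend"], fill_value=0
    )
    return counts.div(n_days, axis="columns")

# Made-up timestamps within one week, Saturday 4 July to Friday 10 July 2015.
sample = ["2015-07-04 19:15", "2015-07-04 19:40", "2015-07-05 00:20",
          "2015-07-06 18:05", "2015-07-08 04:30"]
avg = hourly_average(sample, "2015-07-04", "2015-07-10")
print(avg.loc[[18, 19]])
```

Plotting the two columns of `avg` against the hour of day gives the two curves whose vertical and horizontal shifts are described above.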
From this first stage of data exploration, it appears that in the urban environment of Toronto, a sizable portion of fires are caused by certain human activities that vary over time, rather than by random lightning strikes or evenly dispersed electrical failures. That subset of activities is also more common in evenings. The shift to later hours on weekends suggests these activities, or at least a decent-sized subset thereof, are shifted in that manner as well.
Residential and Rubbish, the Interaction With Season
What about variations between seasons? The diagram below shows how the frequency of fire incidents depends on hour of the day as well as season and month of the year.

The familiar pattern of active evening hours is present. However, compared to the colder and darker seasons, the curves of the warmer and brighter seasons are shifted upwards and their peaks are shifted rightwards. In other words, warmer and brighter means more and later fire incidents in Toronto.
TFS broadly classifies fires in terms of what is on fire: high-rise residential, grass or rubbish, vehicle on highway etc. The categories with the greatest number of occurrences are "Grass/Rubbish" and the union of all types of residential fires. By further segmenting the data on this categorical variable, then plotting the seasonal and hour-of-day variation, the following curves are obtained.

Residential fires (figure on the left) vary with time of the day with a peak around 6-7 PM, but there is no discernible variation with season (the observable difference is not large enough to be distinguished from statistical noise). Fire incidents involving grass or rubbish (figure on the right), on the other hand, are highly dependent on season. The frigid winters of Toronto understandably raise the threshold for outdoor fires. But which is altered the most: the physics of the fuel, or the outdoor activities of the people who cause the fires?
Incident Frequency Beyond the Average
Before I continue to drill down in the data, I will look beyond the average. All data visualizations so far relate to averages for various segments of time and most recently two segments of fire types. Each segment contains multiple data points, however, and it is often revealing, even necessary, in data analysis to characterize the spread within such groupings.
The histograms of the number of incidents for two illustrative segments are shown below: first for winter weekdays (Mondays to Fridays) between 9 AM and 10 AM; second for summer weekends between 6 PM and 7 PM.


Consider the first of the two histograms. It visualizes facts of the data such as
- for nearly half (~49%) of all weekdays in winter during hour 9 (9 AM to 10 AM), there are zero dispatches to fires, and
- for a bit over one in every ten (~12%) winter weekdays, there are two dispatches to fires during hour 9.
The average incident frequency for the first histogram is 0.7, reflecting that the data in this segment is concentrated at low values.
Contrast this with the second histogram, which has an average of 2.3. The bulk of the data is shifted towards slightly higher values: days with 1, 2 or 3 incidents, for example, make up two-thirds of all days in this particular time segment. There is also a tail towards even higher values, such as 6 and 7 incidents.
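The numbers behind such a histogram are the shares of days with 0, 1, 2, ... incidents in the chosen hour. A minimal sketch of that count follows; the function name `count_distribution` and the sample data are hypothetical. Days without any incident have to be added back explicitly as zeros, otherwise the tallest bar of the first histogram would be missing.

```python
import pandas as pd

def count_distribution(timestamps, days, hour):
    """Share of the given days that had k incidents during the given hour."""
    ts = pd.Series(pd.to_datetime(timestamps))
    in_hour = ts[ts.dt.hour == hour]
    # Incidents per calendar day; days in the segment without incidents get 0.
    per_day = (
        in_hour.dt.normalize().value_counts()
        .reindex(pd.DatetimeIndex(days), fill_value=0)
    )
    return per_day.value_counts(normalize=True).sort_index()

# Made-up example: the five weekdays of one winter week, hour 9 (9 AM to 10 AM).
segment_days = pd.date_range("2015-01-05", "2015-01-09", freq="D")
sample = ["2015-01-05 09:10", "2015-01-05 09:50",
          "2015-01-07 09:30", "2015-01-07 14:00"]
dist = count_distribution(sample, segment_days, hour=9)
print(dist)
```

In this toy week, three of five days have no incident in hour 9, one day has one, and one day has two; the 2 PM incident falls outside the segment and is ignored.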
Making Sense of the Histograms with Poisson
Both of the histograms resemble the probability mass function of the Poisson distribution. There are theoretical reasons as well why this should be the expected form.
The Poisson distribution has one parameter, conventionally denoted λ and called the intensity parameter.

$$P(k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots$$

I will not get into the details of this equation; its special features are described well elsewhere. As it relates to the fire incident data, the qualitative interpretation is this: the equation gives the probability that k incidents occur in a given time segment. The greater the intensity parameter λ, the higher the average frequency of incidents, and the further the corresponding histogram spreads towards higher values, though there always exists a value of k beyond which the probability decreases.
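As a quick check, the standard Poisson probability P(k) = λ^k e^(-λ) / k! can be evaluated with λ set to the two segment averages quoted earlier, 0.7 and 2.3. The helper name `poisson_pmf` is my own; only the Python standard library is used.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# First histogram: winter weekdays, 9 AM to 10 AM, average 0.7.
p_zero = poisson_pmf(0, 0.7)
p_two = poisson_pmf(2, 0.7)

# Second histogram: summer weekends, 6 PM to 7 PM, average 2.3.
p_one_to_three = sum(poisson_pmf(k, 2.3) for k in (1, 2, 3))

print(f"lambda=0.7: P(0)={p_zero:.3f}, P(2)={p_two:.3f}")
print(f"lambda=2.3: P(1 to 3)={p_one_to_three:.3f}")
```

With λ = 0.7 the formula gives roughly 0.50 for zero incidents and 0.12 for two incidents, in line with the ~49% and ~12% read off the first histogram. With λ = 2.3, the probability of 1, 2 or 3 incidents comes to about 0.70, close to the two-thirds seen in the second histogram.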
The data exploration shows that the intensity parameter of the fire incident process varies with time. In other words, λ is a function of time.
This equates to modelling fire incidents as an inhomogeneous Poisson process. The fitting of λ to data can formally be done via maximum likelihood estimation. Conveniently, for the counts within a single time segment the result is simple: the maximum likelihood estimate of λ is the average number of incidents in that segment, mirroring the fact that the mean of an exact Poisson distribution equals λ.
In other words, all line charts above can be reinterpreted as the plots of λ, the Poisson process intensity parameter, as a function of time segment.
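For counts drawn from a single Poisson distribution, the sample mean coincides with the maximum likelihood estimate of λ. The sketch below checks this numerically with a brute-force grid search; the ten counts are made up for illustration, and the grid search stands in for a proper optimizer.

```python
import math
import numpy as np

# Hypothetical incident counts for ten days in one time segment.
counts = np.array([0, 0, 1, 0, 2, 1, 0, 3, 0, 1])

def log_likelihood(lam, counts):
    """Poisson log-likelihood of the counts for a given intensity lam."""
    log_factorials = np.array([math.lgamma(k + 1) for k in counts])
    return float(np.sum(counts * math.log(lam) - lam - log_factorials))

# Evaluate the log-likelihood on a grid of candidate intensities (step 0.01).
grid = np.linspace(0.05, 3.0, 296)
values = [log_likelihood(lam, counts) for lam in grid]
best = float(grid[int(np.argmax(values))])

print(f"grid-search estimate: {best:.2f}, sample mean: {counts.mean():.2f}")
```

The grid search lands on the sample mean to within the grid spacing, which is why the line charts of averages double as plots of λ.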
How Well Does the Poisson Process Describe the Data?
Quite well, it turns out. The worst fit is for the histogram shown below.

The troublesome part is the few days with very high values. The rest of the histogram fits well to a Poisson distribution with λ=3.1. That value should make 10, 12 and 13 incidents extraordinarily rare, however. This suggests that at least for this time segment, a minor but consequential additional process is present as well. Outliers can be instructive and point to places to dig next.
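How rare such days should be under a pure Poisson model can be put into numbers. The sketch below computes the probability of ten or more incidents for λ = 3.1; the helper name `poisson_tail` is hypothetical, and only the standard library is used.

```python
import math

def poisson_tail(k_min, lam):
    """P(X >= k_min) for X following a Poisson distribution with mean lam."""
    below = sum(
        lam ** k * math.exp(-lam) / math.factorial(k) for k in range(k_min)
    )
    return 1.0 - below

p_ten_or_more = poisson_tail(10, 3.1)
print(f"P(10 or more incidents | lambda=3.1) = {p_ten_or_more:.5f}")
```

The result is on the order of 0.1%. Multiplying it by the number of days in the segment gives the expected number of such days; several of them in one histogram is hard to reconcile with a single Poisson process.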
The Heterogeneous Spatial Segment
The data entry for most fire incidents includes a record of the nearest street intersection. Cross-referencing this with latitude and longitude coordinates from a separate open data set enables a well-resolved picture of where in Toronto incidents take place. The image below shows most fire incidents in the data set, where blue dots are fires in residential buildings and amber dots are all other types of fires (the absent incidents are discussed in the Notes).
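The cross-referencing step amounts to a join on the intersection name. A minimal sketch with pandas follows; all column names, the `normalize_name` helper, and the sample rows are assumptions for illustration (the coordinates are rough placeholder values, not taken from the open data set). Light normalization of case and punctuation before the join recovers some, but not all, of the entries that would otherwise fail to match.

```python
import pandas as pd

def normalize_name(name):
    """Upper-case, drop periods, and collapse whitespace in a street name."""
    return " ".join(str(name).upper().replace(".", "").split())

# Hypothetical incident records with the nearest intersection as free text.
incidents = pd.DataFrame({
    "incident_id": [1, 2, 3],
    "intersection": ["Yonge St / Bloor St W",
                     "YONGE ST. / BLOOR ST. W",
                     "Humber River Trail"],
})

# Hypothetical lookup table of intersections with approximate coordinates.
intersections = pd.DataFrame({
    "intersection": ["Yonge St / Bloor St W"],
    "lat": [43.67],
    "lon": [-79.39],
})

incidents["key"] = incidents["intersection"].map(normalize_name)
intersections["key"] = intersections["intersection"].map(normalize_name)

# A left join keeps unmatched incidents; the indicator column flags them.
merged = incidents.merge(
    intersections[["key", "lat", "lon"]], on="key", how="left", indicator=True
)
match_rate = (merged["_merge"] == "both").mean()
print(f"matched: {match_rate:.0%}")
```

Incidents flagged as `left_only`, here the one on a trail, are the ones that end up missing from the map.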

At this high-level perspective the spatial segregation between blue and amber dots tells a simple story: residential areas in Toronto are not evenly dispersed, and there are spatial segments without residential houses or high-rises. Not an earth-shattering realization.
It is in principle possible to repeat the earlier analysis of the different time segments with the data further divided into spatial segments. Could it for example be that some Toronto neighbourhoods have a distinct dependency on time, reflective of underlying demographic, economic or architectural features?
However, as much as the nearly 60,000 data points of fire incidents can teach, once they are sliced and diced into fine segments, such as hour of the day, season of the year, day in the week, and neighbourhood, the data has become too diluted. Without a guiding hypothesis, data exploration is too easily lost in creative interpretations of statistical noise. Additional data to cross-reference with, and the creation of quantitative models, can bridge to a fuller understanding of the finer details of the city that make fire incidents different across the city. A resolved spatiotemporal model is not too far away.
Conspicuous in the Data: Ice Storm of December 2013
The map visualization of the incident data, when made dynamic with respect to month and year, makes one major deviation visually very clear. The animated image below visualizes the spatial distribution of grass/rubbish fire incidents in Toronto for a few months from the end of 2013 to early 2014.

December 2013 stands out from the otherwise smooth reduction of grass and rubbish fires in the winter months compared to spring and fall. Zooming in on the neighbourhood of North York Centre and adding dates as labels, it is crystal clear that between December 21 and December 26 something out of the ordinary took place.

Indeed, these were the days when a severe ice storm hit Toronto. Power lines broke under the burden of ice, and individuals who elected to warm themselves with indoor fires did their part in raising the number of fire incidents.
The ice storm of December 2013 is an excellent reminder that, as useful and mostly accurate as a fitted inhomogeneous Poisson process is for the quantitative modelling of fire incident frequency, major errors in the parameterization of λ can happen. When the underlying features shift beyond a critical threshold, in this case the ability of (at least) the electrical infrastructure to withstand an ice storm, the typical domain of fire incidents in the Toronto winter is no longer representative. Once this discontinuity is crossed, interpolations and extrapolations within the typical domain are simply in the wrong realm of reality. The hard problem, some argue the impossible problem, is to know what these critical thresholds are before they are crossed.
Segments, Features-That-Explain and Cross-Referencing
So is the frequency of fire incidents simply a function of time of day, day of the week and season? That compact model reproduces the observations rather well, so there is something to this simple conception. Maybe most fires are, after all, caused by stovetop blunders and cozy evening outdoor barbecues, in which case the features of the city that explain the dominant relations may not be that hard to identify, and a predictive model can be built and validated through cross-referencing with other data sets.
However, the consequence of fire incidents is not uniform. The rare but large fires are hidden in a sea of minor grass and rubbish incidents. A model for the variations and dependencies of the many minor incidents certainly has utility, but it would mostly overlook the important extremes. That’s the topic for part two.
Notes
- TFS incident raw data from the City of Toronto open data webpage.
- Geographic data on street intersections from the City of Toronto open data webpage.
- Not all fire incidents are associated with a street intersection; missing data (in particular for trails) and inconsistent punctuation also preclude matching of geographic coordinates for some incidents. 85% of all fire incidents are matched with geographic coordinates.
- Visualizations are done with Tableau, most quantitative analysis with the Python libraries pandas and NumPy, and geospatial data parsing with the Python library PyShp.