Scrutinising airline efficiency by visualising public aviation data

Visualising UK airline data to understand drivers of efficiency and highlight aviation’s environmental impact

Aine Fairbrother-Browne
Towards Data Science


Photo by jet dela cruz on Unsplash

Source code: github

Listening to The Guardian’s Today In Focus recently, I was fascinated by the episode ‘The Scandal of Britain’s Ghost Flights’, in which the Guardian’s environment editor Damian Carrington discusses his investigative work looking into ‘ghost flights’. As defined in this piece, ghost flights are flights run at less than 10% of available passenger capacity, and according to the Guardian, between March 2020 and September 2021, 15,000 of these took off from UK airports. It is not entirely clear why these occur, but a major contributor is thought to be airport ‘slot’ rules. These are minimum activity thresholds that incentivise airlines to run the majority of their scheduled flights in order to remain at a given airport.

Over the course of the COVID-19 pandemic, Britons largely stayed put, owing to the threat of infection, national lockdowns and travel restrictions. This caused air passenger numbers to nosedive, and in a move to support the floundering aviation industry, the government waived the slot rule. This meant that from April 2020, airlines could operate at 0% (rather than 80%) and still retain their airport slots.

Despite this, in a UK outgoing flight dataset that he obtained, Carrington observed that, even under this new 0% rule, airlines continued to run ghost flights, suggesting that slot rules are only part of the puzzle.

The aviation industry has a massive environmental impact. An hour of flying releases ~250 kg of CO2, which is more than individuals from certain countries release in a whole year. It is therefore a grotesquely wasteful practice for airlines to run ghost flights, emitting huge quantities of carbon without necessity. Tackling this seems like low-hanging fruit for emission reduction, and is perhaps something concrete that can be solved using efficiency improvements, compared with the major shifts in economic and political systems that will be needed to keep global heating below 1.5°C.

I wanted to look further into this, and see whether there was any data I could visualise to better understand this issue. My first port of call was to look to the Civil Aviation Authority UK airline data, mentioned by Carrington in the episode. This is a monthly release of per-airline totals for things like available capacity, passenger numbers, flight numbers and kilometres flown. However, to replicate or expand on Carrington’s findings, I needed per-flight data. So I submitted an FOI to the CAA asking for exactly this…

I can confirm that information is held by the CAA within scope of your original enquiry. It is the CAA’s position, however, that the requested information is […] exempt from release.

… which was promptly refused. From what I could tell, the data is held, but the CAA do not have permission from airlines to release it. I was frustrated, but not surprised by this outcome, as Carrington needed to leverage a parliamentarian contact to obtain the limited data that he did.

Keen not to be defeated by red tape, I decided to look at the data available to me and try to glean some useful insights about airline efficiency without having to go through FOIs and leveraging high-profile contacts. If aviation practices can be scrutinised using this data, this represents an accessible way to make these issues known and arm the public with the information needed to hold airlines, airports and regulators accountable.

The data

I downloaded the UK airline activity data from the CAA website. For each month, the data is provided as a separate table, so I downloaded Table03 — All Services.csv — which contains all incoming and outgoing UK flights — for each month from November 2019 to January 2022 inclusive. The data consists of per-airline monthly totals for 15 variables. Below is a sample of the data showing three of these columns: airline_name, seat_km_used and seat_km_available.

+------------------------+-------------------+--------------+
|      airline_name      | seat_km_available | seat_km_used |
+------------------------+-------------------+--------------+
| AIRTANKER SERVICES LTD |          71854000 |     18314000 |
| AURIGNY AIR SERVICES   |            881000 |       103000 |
| BA CITYFLYER LTD       |            972000 |        21000 |
| BLUE ISLANDS LIMITED   |            580000 |       149000 |
| BRITISH AIRWAYS PLC    |         750959000 |    179433000 |
| CATREUS AOC LTD        |            104000 |        19000 |
| EASTERN AIRWAYS        |           3592000 |      1185000 |
+------------------------+-------------------+--------------+

I identified seat_km_used and seat_km_available as the main metrics of interest for this analysis. seat_km_available is a standard aviation industry measure of the passenger carrying capacity of an airline, which takes into account the number of seats and the distances those seats could fly. For example, if an airline has 1 Boeing 737-800 (capacity 189) and has 1 scheduled flight from London to Edinburgh (approx. 500 km), then its available seat kilometres will equal 189 * 500 = 94,500.
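As a minimal sketch of that arithmetic (the function name and the list-of-pairs input are my own illustration, not how the CAA computes the figure), available seat kilometres can be summed over a schedule like so:

```python
def available_seat_km(flights):
    """Sum seats * distance over (seats, distance_km) pairs.

    `flights` is a hypothetical representation of a schedule:
    one (seats, distance_km) tuple per scheduled flight.
    """
    return sum(seats * distance_km for seats, distance_km in flights)


# One Boeing 737-800 (189 seats) flying London to Edinburgh (approx. 500 km).
schedule = [(189, 500)]
print(available_seat_km(schedule))  # 94500
```

Adding a second identical flight to the schedule doubles the total, which is what makes the metric comparable across airlines with different fleets and route lengths.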

This metric is useful because it allows direct comparison of capacity between airlines with differing journey types (long/short haul), and differing aircraft models in their fleets. Throughout this piece, I will refer to available seat kilometres as an airline’s ‘capacity’. Important definitions relating to this are as follows:

  • Available capacity: scheduled passenger carrying capacity.
  • Used capacity: utilised passenger capacity, i.e. seats used.
  • % used capacity: used capacity as a percentage of available capacity.
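To make these definitions concrete, here is a short sketch that computes % used capacity for two rows of the sample table above. The helper name `pct_used_capacity` is hypothetical, and returning `None` for zero available capacity is my own choice rather than anything specified by the CAA data.

```python
def pct_used_capacity(seat_km_used, seat_km_available):
    """Used capacity as a percentage of available capacity.

    Returns None when no capacity was available, to avoid dividing by zero
    (an assumption about how to treat inactive airlines).
    """
    if seat_km_available == 0:
        return None
    return 100 * seat_km_used / seat_km_available


# Two rows copied from the sample table: (seat_km_available, seat_km_used).
sample = {
    "AIRTANKER SERVICES LTD": (71854000, 18314000),
    "BRITISH AIRWAYS PLC": (750959000, 179433000),
}

for airline, (available, used) in sample.items():
    print(airline, round(pct_used_capacity(used, available), 1))
```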

The analysis

For this analysis, I focused on the top five airlines by highest average available capacity. To understand airline capacity over the past three years, I plotted available capacity over time for each airline, visualising % used capacity using point size (fig. 1). To track these variables in response to changing slot rules, I used data from pre-pandemic (November 2019) through to the present day (January 2022), indicating the time span of the 0% rule using shading.
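The selection step can be sketched roughly as follows. This assumes the monthly tables have already been read into a list of dictionaries with `airline_name` and `seat_km_available` keys; the function name and the toy values are made up for illustration and are not CAA figures.

```python
from collections import defaultdict
from statistics import mean


def top_n_by_mean_capacity(records, n=5):
    """Return the n airline names with the highest mean seat_km_available.

    `records` is a list of dicts, one per airline per month (hypothetical shape).
    """
    by_airline = defaultdict(list)
    for row in records:
        by_airline[row["airline_name"]].append(row["seat_km_available"])
    means = {name: mean(values) for name, values in by_airline.items()}
    return sorted(means, key=means.get, reverse=True)[:n]


# Toy data: three invented airlines over two months.
toy = [
    {"airline_name": "A", "seat_km_available": 100},
    {"airline_name": "A", "seat_km_available": 300},
    {"airline_name": "B", "seat_km_available": 500},
    {"airline_name": "B", "seat_km_available": 700},
    {"airline_name": "C", "seat_km_available": 50},
    {"airline_name": "C", "seat_km_available": 150},
]
print(top_n_by_mean_capacity(toy, n=2))  # ['B', 'A']
```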

Figure 1. A time course to show available passenger capacity of five UK airlines over time (November 2019 to January 2022). % capacity used is indicated by point size (larger points mean high % capacity used, smaller points mean low % capacity used). Shading indicates time covered by the 0% slot rule.

Even without marking pandemic-related events onto the plot, the effect of the pandemic is clearly visible in the data. Airline available capacity falls in early 2020 as airlines shrink their operations, and in the years following, airlines attempt to increase capacity amid low passenger numbers which fluctuate in correspondence with disease control measures. To understand how airline capacity was affected by changing slot rules, and to take into account the effect of lockdowns, I split the analysis into three sections: pre-pandemic, pandemic and ‘post-pandemic’ (by which of course I mean the post-lockdown period where there is no longer legal enforcement of social distancing and disease control measures).
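One way to express this three-way split in code is to label each monthly record by the slot rule in force. The sketch below uses the dates given in this piece (0% rule from April 2020, 50% rule from November 2021); the function and constant names are hypothetical, and treating March 2020 as still under the 80% rule is an assumption.

```python
from datetime import date

# Rule boundaries as described in this article; exact legal dates may differ.
ZERO_RULE_START = date(2020, 4, 1)
FIFTY_RULE_START = date(2021, 11, 1)


def slot_rule_period(month_start):
    """Label a month (given as the date of its first day) with its slot rule."""
    if month_start < ZERO_RULE_START:
        return "80% rule"
    if month_start < FIFTY_RULE_START:
        return "0% rule"
    return "50% rule"


for month in (date(2019, 11, 1), date(2020, 6, 1), date(2022, 1, 1)):
    print(month.isoformat(), slot_rule_period(month))
```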


1. Pre-pandemic (80% rule)

I first looked at the pre-pandemic period — where the slot rule was set at 80% — to gauge baseline airline capacity and usage. This period is visible in figure 1 between November 2019 and February 2020, and is characterised by high available capacity.

For a higher resolution view on capacity metrics during these months, I plotted available capacity against used capacity, indicating % used capacity as a label for each airline (fig. 2).

Figure 2. Correlation plots to show used capacity against available capacity for 5 UK airlines. % used capacity is indicated as a label for each airline.

In figure 2, perfectly matched airline supply and demand sits on the diagonal. Overall, airlines are operating at ~80% used capacity pre-pandemic. Virgin Atlantic Airways slips below this in December, and down to 76% used capacity by January, whilst British Airways sits at 81–84%.

These % used capacity figures leave 17%, 11%, 12%, 7%, and 18% unused capacity for British Airways, EasyJet, Jet2, TUI and Virgin Atlantic Airways respectively. Perhaps these seem like negligible proportions of unused capacity, but let’s put this into absolute terms. I translated unused capacity into absolute number of empty flights by multiplying the number of flights by the % unused capacity, giving the number of flights per month that could have been entirely empty if all other planes were full (figure 3).
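The conversion described above can be sketched as a one-line calculation. The function name is hypothetical and the input values are deliberately round toy numbers, not CAA figures.

```python
def equivalent_empty_flights(n_flights, seat_km_used, seat_km_available):
    """Flights that could have been entirely empty if all others were full.

    Computed as number of flights multiplied by the unused share of capacity.
    """
    unused_share = (seat_km_available - seat_km_used) / seat_km_available
    return n_flights * unused_share


# Toy example: 1,000 flights in a month with 80% of capacity used.
print(equivalent_empty_flights(1000, 800, 1000))  # 200.0
```

Note that this treats every flight as carrying an equal share of the airline's capacity, which is a simplification for a mixed fleet.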

Figure 3. Barplot to show, for November 2019 to January 2020 inclusive, the number of empty flights each airline would have run if all empty capacity was converted to flights.

For British Airways, the largest airline, this unused capacity translates into ~3500–4000 entirely empty flights arriving at or departing from UK airports each month. This is a simplification of the true situation. It is likely that the majority of this unused capacity will be made up of low or moderately low capacity flights, with a portion coming from actual ghost flights. But I do think this paints a grim picture of aviation wastefulness, and raises questions about what is driving this. Let’s take a look at what happens when the 80% slot rule is waived.

2. Pandemic — 0% rule

By March 2020, COVID-19 cases were rising rapidly and on March 26th 2020, the first lockdown was instated. In tandem with this, demand for air travel plummeted, making it impossible for airlines to run 80% of their scheduled activities. As a result, the government waived the slot rule, meaning that airlines could operate at 0% and still retain their airport slots. When analysing this period of time, it is important to consider the timing of the UK lockdowns*, so I have indicated these in figure 4 using shading.

Figure 4. Time course as in figure 1, but with shading indicating the three national UK lockdowns.

I will break this part of the analysis down further, into three sections representing the three ‘phases’ of the 0% rule — the first lockdown, the inter-lockdown phase and the third lockdown — plotting monthly correlations of used and available capacity for each.

2.1 First lockdown (April 2020 — August 2020)

Figure 5. Correlation plots to show used capacity against available capacity for 5 UK airlines. % used capacity is indicated as a label for each airline.

The first lockdown was a period of instability for airlines, reflected in the fluctuations of available capacity. Figure 5 shows TUI and Jet2 for the most part maintaining low capacity, and largely filling it. Given this response, it is striking to see British Airways consistently running at 24–36% used capacity from April to August, and Virgin Atlantic Airways, despite reaching pre-pandemic % used capacity figures in May and June, falling to 14% used capacity in July and 17% in August.

2.2 Inter-lockdown phase (August 2020 — October 2020)

Figure 6. Correlation plots to show used capacity against available capacity for 5 UK airlines. % used capacity is indicated as a label for each airline.

In this inter-lockdown period, characterised by the continuation of the 0% rule and slightly heightened post-lockdown air travel demand, Virgin Atlantic Airways and British Airways again seem to vastly over-generate their available capacity. The former catastrophically so, using just 17–24% of its available capacity over the four months of this period.

2.3 Third lockdown (January 2021 — May 2021)

Figure 7. Correlation plots to show used capacity against available capacity for 5 UK airlines. % used capacity is indicated as a label for each airline.

The third lockdown again sees airlines running at very low % used capacities. British Airways, Virgin Atlantic Airways and Jet2 were running at <40% for the majority of this period, with EasyJet fluctuating between 26% and 43%. Despite this, as can be seen in figure 4, available capacity was maintained, seemingly unresponsive to these figures.

3. Post-restrictions — 50% rule

In November 2021, the slot rule was re-introduced, but set to 50% rather than 80%. Let’s look at this transition period on the time course plot.

Figure 8. Time course as in figure 1, but limited to October 2021 through January 2022, with labels added to indicate capacity used as a percent of available.

As the slot rule was re-introduced, most airlines boosted their available capacity. For EasyJet, TUI and Jet2, this resulted in a decline in % used capacity, as they increased available capacity to reach the 50% activity quota, but did not have the passenger demand to warrant this. Here, we see direct evidence for the slot rule forcing airlines to operate, and generate unused capacity.

*The start dates are clear from government sources, but end dates are less clear-cut due to the gradual lifting of restrictions, so I have used the lifting of the majority of mixing restrictions as the approximate end date, as per the UK government roadmap.

Conclusions

This analysis has highlighted the following:

  • In the pre-pandemic era, slot rules contributed to unused airline capacity by forcing airlines to reach activity thresholds. In the case of British Airways, this was reached by running the equivalent of 3500–4000 empty flights per month.
  • The contribution of the slot rule to unused airline capacity is further evidenced by the re-introduction of the 50% slot rule, and the hike in airline available capacity accompanied by a fall in % used capacity.
  • During the UK lockdowns, passenger demand was extremely low, and the slot rule was waived. Despite these facts suggesting that airlines could be less active, or indeed operate with greater efficiency, some were running at very low % used capacity, notably British Airways and Virgin Atlantic Airways.
  • The 0% rule data suggests that there is more at play than slot rules alone. However, it is not currently clear what the motivations would be to run ghost flights, or very low capacity flights, particularly when there is no risk of slot loss.

Initially I began looking at this data out of an interest in ghost flights, but I emerged thinking more about the staggering wastefulness of the aviation industry, and the large environmental impact that it has.

This analysis shows that we can use public data to keep tabs on airline efficiency to an extent. However, improved data transparency is needed to make regulatory bodies and airlines accountable. Scrutinising efficiency is one way that pressure can be exerted on the aviation industry by the public, with the goal of forcing a reduction of its substantial — and continuously soaring — carbon footprint.
