
A data-driven guide on choosing who to fly with

Continuing from my previous post on airline preference by country, today I look into the detailed ratings of the 30 most reviewed airlines on Skytrax.

Which airlines are in similar tiers?

Given the detailed ratings on 7 sub-categories (seat comfort, cabin staff, food and beverages, inflight entertainment, ground service, wifi connectivity and value for money), we can hierarchically cluster the airlines, and the results look quite reasonable.

We observe the following clusters:

  • budget airlines (Ryanair, Jetstar, easyJet)
  • poorly rated North American airlines (United, US Airways, Air Canada Rouge)
  • somewhat better rated European airlines (Virgin Atlantic, Air France)
  • decently rated airlines from Asia and the Middle East (Cathay, Singapore Airlines, Qatar, Emirates)

One side note: the detailed ratings may not capture the overall rating of an airline if there are other factors that matter to passengers but are not covered by these attributes, such as safety.
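
The clustering step described in this section can be sketched roughly as follows. This is a minimal illustration, not the exact code behind the post: the airline names and rating values are made up, and the choice of Euclidean distance with Ward linkage is an assumption, since the post does not state which distance or linkage was used.

```r
# Hypothetical mean ratings (1-5 scale) for six placeholder airlines.
# All names and values are invented for illustration only.
ratings <- data.frame(
  seat_comfort           = c(2.4, 2.5, 3.0, 3.1, 4.0, 4.1),
  cabin_staff            = c(2.8, 2.7, 3.2, 3.3, 4.3, 4.2),
  food_beverages         = c(1.9, 2.0, 2.9, 3.0, 4.0, 4.1),
  inflight_entertainment = c(1.5, 1.6, 3.0, 3.1, 4.2, 4.3),
  ground_service         = c(2.5, 2.6, 2.9, 3.0, 4.0, 3.9),
  wifi_connectivity      = c(1.2, 1.3, 2.0, 2.1, 3.5, 3.4),
  value_money            = c(3.8, 3.7, 3.0, 3.1, 4.1, 4.0),
  row.names = c("Budget A", "Budget B", "Legacy A",
                "Legacy B", "Premium A", "Premium B")
)

# Pairwise Euclidean distance between airlines (assumed metric),
# then agglomerative clustering with Ward linkage (assumed linkage).
distances <- dist(ratings, method = "euclidean")
tree <- hclust(distances, method = "ward.D2")

# Cut the dendrogram into three groups and list the members of each.
clusters <- cutree(tree, k = 3)
print(split(names(clusters), clusters))

# plot(tree) would draw the dendrogram itself.
```

All seven attributes share the same 1-5 scale here, so the sketch skips standardisation; with attributes on different scales, calling scale() on the ratings before dist() would be a reasonable extra step.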

How do clusters of airlines differ from each other?

To further understand the pros and cons of each airline, we examined the detailed ratings.

We can see the attributes of these clusters:

  • Air France and Virgin Atlantic have low ratings on wifi connectivity
  • Budget airlines are rated low on wifi and inflight entertainment but high on value for money. Why would anyone even expect wifi and entertainment on a budget flight?
  • Airlines like Delta and British Airways score decently, though wifi and ground service are again relatively low
  • Thomson and Thomas Cook, two regional British airlines, actually have decent ratings on wifi
  • North American airlines like United are rated average on every attribute
  • Asian/Middle Eastern airlines like Emirates and Singapore Airlines score the highest (comparatively, since no airline actually reaches a rating of 5)
Detailed rating in same color as the previous clusters

The implication is that you can choose a flight depending on what matters most to you in the air, be it wifi or seat comfort. Airlines could also evaluate their price competitiveness against competitors with similar ratings.

Which attribute matters the most to overall rating?

Examining the correlation between each attribute and the overall rating, we can see that value for money matters the most. Service from cabin crew and ground staff is the next most important, followed by seat comfort and food. Movies and wifi don't really matter that much.
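
A minimal sketch of this kind of comparison is shown below. It assumes Pearson correlation (the default of cor()) computed on review-level data; the post does not say which correlation measure was used, and the data frame here is invented for illustration, with only a few of the attributes included.

```r
# Hypothetical review-level data: one row per review.
# All values are invented for illustration only.
reviews <- data.frame(
  overall           = c(9, 3, 7, 2, 8, 5, 6, 1),
  value_money       = c(5, 2, 4, 1, 4, 3, 3, 1),
  cabin_staff       = c(5, 2, 4, 2, 4, 3, 4, 1),
  seat_comfort      = c(4, 2, 3, 2, 4, 3, 3, 2),
  wifi_connectivity = c(3, 2, 1, 3, 2, 4, 2, 3)
)

# Correlate every attribute column with the overall rating.
# use = "complete.obs" drops reviews with a missing value,
# which is common in real review data.
attribute_cols <- setdiff(names(reviews), "overall")
correlations <- sapply(
  reviews[attribute_cols],
  cor,
  y = reviews$overall,
  use = "complete.obs"
)

# Rank attributes from most to least correlated with the overall rating.
print(sort(correlations, decreasing = TRUE))
```

Correlation only measures how closely an attribute moves with the overall rating, so a high value indicates association rather than proof that the attribute drives the score.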


What I learnt today is how to draw a radar chart in R. Radar charts can be tricky when they show too many categories with drastically different ratings, in which case the ordering of the categories may distort the story. Here most ratings fall between 3 and 4, so it is a decent choice for understanding the rating in each category.

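library(fmsb)
# radarchart() expects row 1 to hold the max and row 2 the min of each axis
# keep only the 7 numeric rating columns (column 1 is assumed to hold the airline name)
ratings <- df[, 2:8]
radar_df <- rbind(rep(5, 7), rep(0, 7), ratings)
# draw one polygon per airline on a 0-5 scale
radarchart(radar_df)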

This is #day47 of my #100dayprojects on Data Science and visual storytelling. Full code on my github. Thanks for reading. Suggestions for new topics and feedback are always welcome.

