
Analyzing Disneyland Paris visitor reviews for parks and hotels – Part 1

Exploratory Data Analysis and Topic Modeling on more than 120k TripAdvisor reviews 📝

Word frequencies of all our English reviews for the Disneyland Paris parks

Introduction

Disneyland Paris is one of the most famous entertainment resorts in the world. It has two parks with shows, stunts and parades, special events, characters, and rides. It also has several hotels and restaurants, and is ranked third on Tripadvisor among things to visit in France.

Get ready for a magical time at Disneyland Paris!

This is one of the first sentences that appears on the company website, and in general, the communication is directed towards living a unique experience with family or friends.

With a little bit of context about the company and a good amount of reviews from TripAdvisor, let’s try, without any specific expectations, to discover together what visitors who left a review are saying about it.

A word about our datasets

Before starting the analysis, here are a few details about the data we have. Our dataset is composed of a little more than 120k Tripadvisor reviews for the Disneyland Paris parks and hotels, fetched on March 12, 2021.

Disneyland Parks

  • Disneyland Paris → 41k reviews
  • Walt Disney Studios Park → 16k reviews

Combined into a single data set of 57,799 rows (216 MB)

Disneyland Hotels

  • Disneyland Hotel → 8k reviews
  • Disney’s Hotel New York → 8k reviews
  • Disney’s Newport Bay Club → 8k reviews
  • Disney’s Sequoia Lodge → 12k reviews
  • Disney’s Hotel Cheyenne → 11k reviews
  • Disney’s Davy Crockett Ranch → 7k reviews
  • Disney’s Hotel Santa Fe → 10k reviews

Combined into a single data set of 65,250 rows (217 MB)

Disneyland Nature Resorts

  • Villages Nature Paris → 3k reviews

Analyzing Disneyland Paris parks 🎢

This article will only focus on the park reviews (Disneyland Paris and Walt Disney Studios).

Why, you may ask? Because there are more hotels (seven of them), their reviews contain more fields such as sub-ratings (cleanliness, sleep quality, rooms, service, value), and we have to start somewhere!

Feel free to leave any comments and feedback, and if you like this article, we can make another one dedicated to the hotels in a second part 😉. For now, we will proceed in two steps to try to get the most out of our reviews dataset.

Exploratory Data Analysis (EDA)

We first explore our dataset and try to come up with interesting insights such as:

  • Visitor Satisfaction
  • Sentiment over time
  • Response time and rate

Text Analysis and topics discovery

We have a decent bunch of reviews, and we would like to understand the content of the text left by our visitors (mainly by extracting different topics or subjects). For that, we will look at things like:

  • Reviews length, emojis, and languages
  • Word frequency and N-grams
  • Topic Modeling with LDA and NMF

Now that we have everything in hand, let’s dive in!


Exploratory data analysis

How many reviews per month?

Tripadvisor was created in 2000 and Disneyland Paris in 1992. Walt Disney Studios Park opened a little later, in 2002.

From the following graph, we can see that visitors only really started to share and write reviews from 2010 onwards.

Number of reviews per month from 2004 to 2021

It is really interesting to observe that most of the spikes correspond to the June–August period and that there is an apparent seasonality.

From the beginning of 2019, we observe a declining trend, and almost no reviews left since March 2020 as the Covid-19 pandemic started.

Also, by computing the correlation (with BigQuery’s CORR() function) between the monthly review counts of Disneyland Paris and Walt Disney Studios, we see that the two lines are 97% correlated, meaning they share the same review seasonality and trend.
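
As a minimal sketch, this can be computed directly in BigQuery; the table and column names below (a reviews table with a park label and a parsed review_date) are hypothetical:

```sql
-- Monthly review counts per park, then the correlation between the two series
WITH monthly AS (
  SELECT
    TIMESTAMP_TRUNC(review_date, MONTH) AS month,
    COUNTIF(park = 'Disneyland Paris') AS dlp_reviews,
    COUNTIF(park = 'Walt Disney Studios') AS wds_reviews
  FROM `project.dataset.reviews`
  GROUP BY month
)
SELECT CORR(dlp_reviews, wds_reviews) AS review_count_correlation
FROM monthly;
```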

We can ask ourselves some questions: why is there no comparable seasonality for the Halloween 🎃 or Christmas 🎄 periods? And what could explain the continuous decrease in reviews in 2019?

SQL tips

The Tripadvisor API returns dates as strings in the following format: 2016-03-17T12:40:44-04:00. In order to parse them, we use the PARSE_TIMESTAMP() function in BigQuery.
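
As a minimal sketch, the format string could look like this (%Ez matches the numeric timezone offset):

```sql
-- Parse the Tripadvisor ISO 8601 date string into a TIMESTAMP
SELECT PARSE_TIMESTAMP('%Y-%m-%dT%H:%M:%S%Ez', '2016-03-17T12:40:44-04:00') AS review_ts;
```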

Overall ratings and average rating over time

In addition to leaving a written text, visitors can also rate their experience from 1 to 5 stars, which we converted into readable labels.

Reviews grouped by rating (1–5 stars)

Across all our reviews, it is overall a great experience that is reported.

For the two parks, the visitors’ experience is more likely to be excellent or very good.

In general, it seems that visitors are really enjoying the experience, but does this experience improve over time?

Let’s look at the average rating per quarter, starting from 2012, as it was from that year onwards that visitors started to leave a lot of reviews.

Average rating and relative change over quarters

Don’t let the scale of the graph mislead you. We are looking at an average rating that varies between 3.7 and 4.3 stars, which remains a small interval. We can split this graph into two parts: before and after Q1 2018.

Before that, from 2012 to 2018, the average rating hovers above 4 stars, with one of the biggest drops between Q1 2016 and Q3 2016, when it falls to an average of 3.95 stars.

Also striking is the continuous decrease from Q3 2018 to Q1 2020, reaching one of the lowest average ratings.

SQL tips

To get quarters, we use a TIMESTAMP_TRUNC() function using a QUARTER parameter. Additionally, we are computing a relative change leveraging the power of the LAG() function in BigQuery.
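
A minimal sketch of that query, with hypothetical table and column names:

```sql
-- Average rating per quarter and its relative change vs. the previous quarter
WITH quarterly AS (
  SELECT
    TIMESTAMP_TRUNC(review_date, QUARTER) AS quarter,
    AVG(rating) AS avg_rating
  FROM `project.dataset.reviews`
  GROUP BY quarter
)
SELECT
  quarter,
  avg_rating,
  SAFE_DIVIDE(
    avg_rating - LAG(avg_rating) OVER (ORDER BY quarter),
    LAG(avg_rating) OVER (ORDER BY quarter)
  ) AS relative_change
FROM quarterly
ORDER BY quarter;
```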

Let’s try a different visualization using a green-amber-red indicator (basically converting our rating into a Negative-Neutral-Positive indicator).

Red-Amber-Green indicator for reviews over quarters

With only 8 reviews in Q1 2021, that quarter doesn’t even show on our scale. But as seen in our first graph (reviews per month), we have a declining trend from the end of 2017 until March 2021, meaning that the number of positive reviews is also decreasing and represents a smaller share of the total reviews.

Response rate and response time

Reviews can get answered by the Tripadvisor page owner, in this case, the Disneyland Paris team. Two metrics that come to mind are the response rate and the response time.

The Disneyland Paris team started to answer reviews on Tripadvisor for the parks at the end of 2016. They had a really good start, answering most of the reviews with an answer rate between 40% and 60% for at least three quarters, but then this rate dropped radically.

In general, Disneyland Paris has an average answer rate of 15%. This means that out of 100 reviews left on Tripadvisor, only 15 would get answered.

What about the response time?

Percentiles for response time in days

The median value is 3 days, and getting an answer after 16 days is unlikely. In addition to this table, the average answer time is 5.7 days.
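
The article doesn’t show how these percentiles were computed; one way in BigQuery is APPROX_QUANTILES (response_time_days is a hypothetical column, e.g. a TIMESTAMP_DIFF between the response and the review):

```sql
-- Percentiles and average of the response time, in days
SELECT
  APPROX_QUANTILES(response_time_days, 100)[OFFSET(50)] AS p50_days,
  APPROX_QUANTILES(response_time_days, 100)[OFFSET(90)] AS p90_days,
  AVG(response_time_days) AS avg_days
FROM `project.dataset.reviews`
WHERE response_time_days IS NOT NULL;
```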

SQL tips

To count the number of answers in a quarter, we convert the owner_response field to a boolean 1 or 0. Then, we can use a SUM() function to aggregate it per quarter in a single query.
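
A minimal sketch of that aggregation, assuming an owner_response field that is NULL when a review got no answer:

```sql
-- Answer rate per quarter: flag answered reviews as 1/0, then SUM them
SELECT
  TIMESTAMP_TRUNC(review_date, QUARTER) AS quarter,
  SUM(IF(owner_response IS NOT NULL, 1, 0)) AS answered_reviews,
  COUNT(*) AS total_reviews,
  SAFE_DIVIDE(SUM(IF(owner_response IS NOT NULL, 1, 0)), COUNT(*)) AS answer_rate
FROM `project.dataset.reviews`
GROUP BY quarter
ORDER BY quarter;
```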


Text Analysis and topics discovery

The second part of our analysis will focus more on what our visitors have expressed in these reviews (we will try our best to find out 👀 ).

We will use different techniques that will help us to detect topics, subjects, or questions we should dive deeper into. As we have no idea of what is contained in these reviews, we will use mostly unsupervised methods to explore and discover.

Reviews length, emojis, and languages

Simply out of curiosity, let’s see if there is any pattern in the number of characters per review. For that, we count the number of characters in each review and average them over months and years.

Average review length over time

On average, our visitors use between 550 and 800 characters per review. Over time it looks rather flat; nothing really stands out.

Can emojis give us some funny indications? Of note: only 1.7% of our reviews contain emojis, which is a really low share of our review dataset.

Looking at the top-used emojis within the different ratings, we can see emojis related to love, laughter, and happiness for 4–5 stars and emojis related to money, anger, and confusion for 1–2 stars.

SQL tips

The emojis are extracted from each individual review. They are stored in BigQuery as a string, encapsulated by two square brackets: [😎,🎉]

In order to parse this field, we use JSON_EXTRACT_STRING_ARRAY(), which converts our string into an array of single values. It is really helpful for parsing a comma-separated list of values.
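
A minimal sketch of that parsing, assuming the stored string parses as a JSON array (table and column names hypothetical):

```sql
-- Explode the emoji array and count occurrences per rating
SELECT rating, emoji, COUNT(*) AS occurrences
FROM `project.dataset.reviews`,
UNNEST(JSON_EXTRACT_STRING_ARRAY(emojis)) AS emoji
GROUP BY rating, emoji
ORDER BY occurrences DESC;
```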

Disneyland Paris is visited by tourists from all around the world; therefore, reviews might be written in many different languages.

Across all reviews, we observe five main languages.

Most reviews are written in English. This could represent countries such as the US, Canada, or the UK.

French is the second most represented language, followed by Italian and Spanish.

For our topic analysis, we will only retain English reviews, which represent just 41% of our dataset, meaning we will miss out on opinions and ideas from other countries and languages.
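
The article doesn’t say how the languages were detected; one common approach is the langdetect library (our assumption, not necessarily what was used here):

```python
# Language detection per review with langdetect (assumed library, not confirmed by the article)
from langdetect import DetectorFactory, detect

DetectorFactory.seed = 0  # make detection deterministic across runs

reviews = ["Great rides!", "Les files d'attente sont longues", "Parque increíble"]
print([detect(r) for r in reviews])  # e.g. ['en', 'fr', 'es']
```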

Word Frequency

Word frequency is a technique that measures the most frequently occurring words or concepts in a given text. We then display them as a wordcloud to help distinguish the most representative ones.

In our case, we removed some words like "Disneyland" or "Paris" from the reviews, as well as all stop words ("the", "them", "your", "us", etc.).

Word frequencies of all our English reviews for the Disneyland Paris parks

For example, here we see that the words "Queue" and "Ride" appear most often in our review set. It’s hard to tell exactly what this suggests, but it is interesting to note.

Let’s apply the same method, but split between positive (4–5 stars) and negative (1–2 stars) reviews, and see if we can notice specific words in each subset.

Positive reviews are in green on the left and negative reviews are in red on the right

Already, we can notice that some words are common to both wordclouds, like "Ride", "Queue", or "Day". We could say that they apply to both negative and positive feelings.

Conversely, we can observe words that mostly appear in one subset or the other:

  • "Time", "Staff", or "Food" are mostly occurring in the negative set
  • "Character", "Parade", or "Kid" are mostly occurring in the positive set

This is interesting, but a single word out of context can be hard to interpret. We will look at collocations to see which words commonly come together.

Python tips

Instead of displaying the wordcloud in a square image format, we use a custom .PNG image to display the words inside a Mickey Mouse shape. The wordcloud library allows us to pass a "mask" parameter to use any custom image.
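
A minimal sketch of that setup (the mask file name and the toy reviews are placeholders):

```python
# Word cloud rendered inside a custom image mask with the wordcloud library
import numpy as np
from PIL import Image
from wordcloud import STOPWORDS, WordCloud

reviews = ["Great rides but long queues", "Magical parade for the kids"]  # toy sample
text = " ".join(reviews).lower()

stopwords = set(STOPWORDS) | {"disneyland", "paris"}  # extra words the article removes
mask = np.array(Image.open("mickey_mask.png"))  # hypothetical black-on-white shape

wc = WordCloud(mask=mask, background_color="white", stopwords=stopwords)
wc.generate(text)
wc.to_file("wordcloud_mickey.png")
```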

Collocation / n-grams

Collocation helps identify words that commonly co-occur. It can be helpful to identify hidden meanings in a sentence.

For example, we saw the word "Rides" was frequently mentioned. But maybe it co-occurs with other meaningful words, as in "Closed Rides" or "Fun Rides".

Another example: in our previous wordclouds, we see the words "Fast" and "Pass". Looking over the internet, we found that Disneyland Paris offers a pass that allows visitors to get on some rides faster. We can assume that these words are more likely to co-occur than to appear as individual words.

To avoid getting into an overly complex analysis, we will look at bi-grams (two adjacent words, like "Fast Pass") and tri-grams (three adjacent words, like "Big Thunder Mountain").

Our top bi-grams by frequency

Top 20 occurrences of two adjacent words (bi-grams)

Some associations seem more likely to happen than others like "Disneyland Paris", "Fast Pass" or "Roller Coaster".

But maybe more interesting, we have "Go Back" or "Well Worth", suggesting visitors are willing to return because their visit was worth it.

Another one, "Rides Closed", might express that attractions were frequently closed and that this is a recurring topic.

Our top tri-grams by frequency

Top 20 occurrences of three adjacent words (tri-grams)

Right away, we see that these combinations of three words are a lot less frequent in our corpus of reviews.

It looks like several of them point towards the age of the children, like "6 years old", "3 years old" or "4 years old".

The "extra magic hours" and "character walking around" may be interesting features the park offers and interesting to look at.

The list of tri-grams is longer, but they might be less relevant as they appear in fewer reviews.

Python tips

We use a custom cleaning function to lemmatize the text, remove the stop words, lowercase everything, and fix encoding issues. Then, the NLTK library gives us the ngrams() function. Finally, we convert our results to a dataframe that we can ingest into BigQuery.
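
A minimal sketch of the bi-gram counting, with a much simpler cleaning step than the article’s custom function:

```python
# Count bi-grams per review with NLTK after basic stop-word removal
from collections import Counter

import nltk
from nltk import ngrams
from nltk.corpus import stopwords

nltk.download("punkt")
nltk.download("stopwords")

reviews = ["The fast pass saved us hours", "The fast pass was well worth it"]  # toy sample
stop = set(stopwords.words("english"))

counts = Counter()
for review in reviews:
    # keep alphabetic, non-stop-word tokens only
    tokens = [w.lower() for w in nltk.word_tokenize(review)
              if w.isalpha() and w.lower() not in stop]
    counts.update(ngrams(tokens, 2))

print(counts.most_common(5))  # (('fast', 'pass'), 2) comes out on top
```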

Topic modeling

Our goal is to automatically organize text by subject or theme. In our case here, we have a bunch of reviews and we want to figure out what topics could come out of all these reviews.

For example, the following reviews could be modeled in a topic called Magical Place.

  • "I can’t say enough about how much I love this magical place."
  • "A truly magical experience and value for money as long as you do your homework on where to eat, I would definitely go again!"
  • "Everything is just magical from the unique rides like the Ratatouille and Nemo’s Crush Coaster."

We will use two algorithms, LDA (Latent Dirichlet Allocation) and NMF (Non-negative Matrix Factorization), to uncover these topics.

These two algorithms work by grouping reviews together based on the words they contain and noticing correlations between them. We use two methods so we can compare the topics found; even if the methods differ, they should bring similar outcomes.

Topics discovered by LDA

Topics discovered using LDA
  • The first topic seems to be about fun rides, popular ones such as Space Mountain and Big Thunder Mountain.
  • The second topic is about queues and waiting times, maybe correlated with or due to closed rides.
  • The third topic seems to be about staff, smoking, and money.
  • The fourth topic describes a magical and amazing experience thanks to the parades and the Disney characters.
  • The fifth topic appears to be related to hotels and monetary value, or also to fireworks and busy hours of the day.

It’s not always easy to find a clear interpretation of the topics, but at least it gives indications and maybe subjects to explore further.

Topics discovered by NMF

Topics discovered using NMF

As for LDA, the outcome of this method is also in the shape of topics.

  • The first topic might be about children loving the parades and meeting the Disney characters.
  • The second topic is about the fun rides, similar to our LDA topics but with alternative names (Crush Coaster, Tower of Terror, Ratatouille).
  • The third seems to be about Fast Pass tickets and waiting times (in hours or minutes).
  • The fourth topic is related to staff/cast members and people.
  • The fifth topic is about hotels, restaurants, and trains.
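
The article doesn’t show the NMF implementation; a minimal sketch with scikit-learn (an assumption on the tooling, with a toy corpus) could look like this:

```python
# NMF topics from a TF-IDF matrix (scikit-learn; toy corpus for illustration)
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

reviews = [
    "magical parade and disney characters",
    "long queue and waiting time for each ride",
    "fast pass tickets saved us waiting time",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(reviews)

nmf = NMF(n_components=2, random_state=0).fit(X)

# print the top terms of each discovered topic
terms = tfidf.get_feature_names_out()
for topic_id, weights in enumerate(nmf.components_):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(topic_id, top_terms)
```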

Python tips

In this example, we are using the Python Gensim library. The LDA function offers enough parameters to tune so we can achieve a good result. Here, the tip is to convert the transformed set of reviews using the Sparse2Corpus() function. Also, we use 2 workers to distribute the computation.
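
A minimal sketch of that pipeline (the CountVectorizer step is our assumption, based on the Sparse2Corpus conversion mentioned above; toy corpus for illustration):

```python
# LDA with Gensim on a sparse count matrix converted via Sparse2Corpus
from gensim.matutils import Sparse2Corpus
from gensim.models import LdaMulticore
from sklearn.feature_extraction.text import CountVectorizer

reviews = [
    "magical parade and disney characters",
    "long queue for every ride",
    "fast pass well worth the money",
]  # toy sample

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(reviews)

corpus = Sparse2Corpus(X, documents_columns=False)  # rows are documents here
id2word = {idx: word for word, idx in vectorizer.vocabulary_.items()}

# 2 workers distribute the computation, as mentioned in the tip
lda = LdaMulticore(corpus=corpus, id2word=id2word, num_topics=2, workers=2)
for topic_id, topic in lda.print_topics():
    print(topic_id, topic)
```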

What could be our next steps? 🐭

We’ve been analyzing reviews about the two Disneyland Paris parks, but it would also be interesting to see what reviews tell us about hotels.

Also, we only considered English reviews for our text analysis part. We could try to use different libraries to include other languages such as French, Spanish or Italian reviews.

Moreover, we could run the text analysis separately for different ratings or sentiments, for example by applying topic modeling to our positive and negative reviews separately.

And that’s it! I hope you enjoyed the article and learned more about text analysis possibilities, don’t hesitate to leave feedback or comments 🤓

