I visited this museum about four months ago on a tour of Wellington, the windiest capital in the world. Its Māori name, ‘Te Papa Tongarewa’, translates literally to ‘container of treasures’, and the museum began in 1865 as the tiny Colonial Museum.
The building is magnificent, and so are the exhibitions inside. Two places I loved were the Marae and the section documenting migrant journeys to New Zealand. As an aspiring data scientist, I had one goal after leaving the place: to find out whether the positive feeling I had matched what other visitors felt, or whether I had missed something. My first stop was a quick look at online reviews. I chose TripAdvisor reviews because they are concise and highly expressive. I collected Te Papa’s TripAdvisor reviews from 2016-04-01 to 2019-07-15. Some summary statistics and a sample of the dataset are shown below.


A few analyses regarding the dataset:
The distribution of review text lengths and word counts
Text Analyses
We cleaned up our data with the following steps:
- Drop the hotel name, review title, helpful votes and usernames.
- Give the ratings out of 5, not 50: simply divide by 10.
- Remove the rows where "review_body" is missing.
- Create a new feature for the length of the review.
- Create a new feature for the word count of the review.
- Use TextBlob to calculate sentiment polarity. Values range from -1 to 1, where 1 is the most positive and -1 the most negative sentiment.
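The steps above can be sketched in pandas roughly as follows. This is a minimal sketch, not the exact code used for this post: only the "review_body" column name comes from the dataset description, while the other column names ("hotel_name", "review_title", "helpful_vote", "user_name", "rating") and the helper names `clean_reviews` and `textblob_polarity` are assumptions made for illustration.

```python
import pandas as pd


def textblob_polarity(text):
    """Return TextBlob's polarity score in [-1, 1] for one review."""
    # Imported lazily; requires `pip install textblob`.
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def clean_reviews(df, polarity_fn=textblob_polarity):
    """Apply the cleaning steps to a raw reviews DataFrame."""
    # Assumed column names; missing ones are silently skipped.
    drop_cols = ["hotel_name", "review_title", "helpful_vote", "user_name"]
    out = df.drop(columns=drop_cols, errors="ignore")

    # Ratings are stored out of 50; divide by 10 to get them out of 5.
    out["rating"] = out["rating"] / 10

    # Remove rows with no review text.
    out = out.dropna(subset=["review_body"]).copy()

    # Character length and word count of each review.
    out["review_len"] = out["review_body"].str.len()
    out["word_count"] = out["review_body"].str.split().str.len()

    # Sentiment polarity in [-1, 1].
    out["polarity"] = out["review_body"].apply(polarity_fn)
    return out
```

Passing the scoring function as a parameter keeps the cleaning logic separate from the sentiment library, so a different scorer can be swapped in without touching the rest.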
The output of the above process is shown below:

TextBlob Operations
TextBlob, a Python (2 and 3) library, provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification and translation. The code snippet below pulls 5 random reviews with the most positive polarity in the dataframe. The same can be done for the most negative reviews (polarity == -0.875 in our case) and for neutral reviews (polarity == 0).
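A minimal sketch of such a snippet, assuming a pandas DataFrame `df` that holds the "review_body" column and a "polarity" column computed with TextBlob as described earlier. The helper name `sample_by_polarity` and the "polarity" column name are hypothetical.

```python
import pandas as pd


def sample_by_polarity(df, polarity, n=5, seed=None):
    """Return up to `n` random review texts whose polarity equals `polarity`."""
    matches = df.loc[df["polarity"] == polarity, "review_body"]
    # Never ask for more rows than actually match.
    return matches.sample(n=min(n, len(matches)), random_state=seed)
```

For the most positive reviews this would be called as `sample_by_polarity(df, df["polarity"].max())`; passing `-0.875` or `0` instead gives the most negative and the neutral reviews.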

I wasn’t satisfied with the performance of the TextBlob library, especially in pinpointing negative sentiments in a dataset that is heavily skewed towards positive polarity, as shown below.

Sentiments with VADER
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool. It relies on a sentiment lexicon, which is a list of lexical features (e.g., words) that are generally labelled according to their semantic orientation as either positive or negative.
I took up VADER to analyze sentiments even though it is better attuned to social media text than to conventional reviews.
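Scoring the reviews with VADER might look like the sketch below. It assumes the `vaderSentiment` package, whose `SentimentIntensityAnalyzer.polarity_scores` returns a dictionary with "neg", "neu", "pos" and "compound" entries. The wrapper name `add_vader_scores` is hypothetical, and the analyzer is passed in as an argument so that the function can be tried out without the package installed.

```python
import pandas as pd


def add_vader_scores(df, analyzer=None):
    """Return a copy of `df` with VADER's neg/neu/pos/compound columns added."""
    if analyzer is None:
        # Imported lazily; requires `pip install vaderSentiment`.
        from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
        analyzer = SentimentIntensityAnalyzer()

    # One dict of scores per review, expanded into one column per key.
    scores = pd.DataFrame(
        list(df["review_body"].map(analyzer.polarity_scores)),
        index=df.index,
    )
    return df.join(scores)
```

With the scores in place, the most negative review can be looked up with `scored.loc[scored["compound"].idxmin(), "review_body"]`.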

The most negative review as per VADER is below.
We went to Te Papa to see the Gallipolli, the scale of war exhibition. This exhibit was great but terribly sad at the same time, as it retold the story about the futility of war and the terrible loss of human life. The oversized models created by the Weta workshop were terribly accurate.
Words like "terribly" and "sad" must have played a key role in the negative score, even though the review is actually favourable. Although the rest of the scores, especially the positive ones, look fine, it takes a sharper eye to catch such inconsistencies. All in all, expecting 100% accuracy in extracted sentiments is unrealistic. The only way out is to train our own model, even though it may not be perfectly accurate either. We’ll do that later on.
Sentiments Over Time
Events influence people’s feelings. The launch of certain exhibitions is bound to elicit mixed reactions from visitors. Plotting sentiments as a time series will give us a better idea of patterns in visitors’ feelings. Before plotting, a few changes are made to the data frame. We sort all values by review time, copy the review time to the index, space the data at 24-hour intervals to make graphing easier, and calculate an expanding and a rolling mean of the compound sentiment scores.
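One way to prepare the data frame for plotting is sketched below. It assumes a "review_date" column holding the review time and a "compound" column holding VADER's compound score; both column names, the function name `sentiment_over_time` and the 30-day rolling window are assumptions. The 24-hour interval is interpreted here as averaging the scores of each calendar day, which may differ from the original preparation.

```python
import pandas as pd


def sentiment_over_time(df, window=30):
    """Daily compound sentiment with expanding and rolling means."""
    daily = (
        df.assign(review_date=pd.to_datetime(df["review_date"]))
          .sort_values("review_date")      # sort by review time
          .set_index("review_date")["compound"]  # review time becomes the index
          .resample("D")                   # one row per 24-hour period
          .mean()                          # days without reviews become NaN
    )
    return pd.DataFrame({
        "daily": daily,
        # Mean of every score seen so far.
        "expanding": daily.expanding().mean(),
        # Mean over the last `window` days, ignoring empty days.
        "rolling": daily.rolling(window, min_periods=1).mean(),
    })
```

Calling `.plot()` on the result (with matplotlib installed) draws the three series on one chart.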

A few interesting points from the plot above:
- Lots of sentiment points are concentrated around a sentiment score of 0.75. This is a strong indicator that most of the reviews were very positive.
- There are lots of data points, which makes patterns hard to interpret.
A sample of the dataset might give a more interpretable graph. The output below corroborates the earlier assertion of strong positivity in the experience of Te Papa visitors.

Conclusion
Judging from the above analyses, Te Papa has a strong footing when it comes to visitors’ experiences. We made use of two sentiment analyzers, both of which proved versatile, although detecting negative sentiment was tough for both. This is the first in a series of posts about Te Papa’s statistical and data science outlook. I look forward to making use of Te Papa’s publicly available datasets to work on the following:
- Training a sentiment classifier based on a reinforced low vector representation with more data from their Google reviews. Hopefully I’ll dig into their social data too.
- Aspect-based sentiment analysis. What aspects are Te Papa visitors ecstatic about, and what dissatisfies them?
- A head-to-head comparison with some of the other top 25 museums in the world as ranked by travelers. What aspects make them do better than Te Papa?
- An analysis of whether certain events in Wellington or at Te Papa itself affect visitor experiences.
Stay tuned!