Dionysos 2.0 🍇: Wine Recommender System and Interactive Tastespace

How to match a person’s tastes with wines they’ll potentially like (plus: the most advanced visualization of wines according to their taste)

Stefano Ioannucci
Towards Data Science


Dionysos, or Bacchus, is the Greco-Roman god of wine. He probably approves of this work. Image made by my buddy Massimo de Antonis

Hello there. Today I will follow up on my previous adventure in the world of wine and what can be done with its data. Indeed, in my previous post’s closing remarks I wondered how interesting it would be to have an individual taste profile for each bottle, which could then be matched to the taste preferences of a person. Turns out there’s no need to work at a wine-selling website to do this. By the way, it’s called a “recommender system”.

The style of today’s anecdote will differ. It won’t be a coding tutorial, as there are plenty of those made by more qualified teachers than me. Still, it may be of some interest to know that everything below can be done in Python, and there are many free resources online to learn how.

Anyhow, I’d like to talk about two things. The first is how to match a person’s tastes to wines they’ll potentially like, which turns out to be a mind-numbingly simple process. The second is an innovative, interactive, data-driven map of wines arranged according to their taste, as judged by hundreds of thousands of users. To my knowledge, this is the first time a map like this has been made, so I’m quite excited to present it.

But first things first: we need data about wines, specifically about individual bottles of wine to which people assign tastes. There isn’t such a dataset out there in the wild, so it has to be built from scratch. One way to do this is by coding a “webscraper” to scrape such data off a wine-selling website. Now, this is technically not allowed, but there are plenty of blog posts out there that openly say they’ve done it, so count me in too. Also, in a romantic way it could be seen as a modern twist on the Robin Hood legend, where the websites that extract data from us every day are themselves the victims of the same scheme. Immodest literary comparisons aside, I’m not earning anything out of this, so it should be alright.

Thus, we magically have a dataset of wines with their taste features estimated, which is pretty neat. Obviously, to mitigate noise in the data, we’ll pick some threshold and include only wines that have a certain number of user votes. Then, we’ll need the taste profile of a user, which consists of all the ratings he or she has left on bottles of wine, each with its own taste features. Luckily for us, this too is easily and freely accessible. I randomly picked a user as a case study, since he happens to like my favorite bottle of wine, thereby clearly having good taste. We’ll call him “Em”.
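For the curious, that setup step can be sketched in a few lines of pandas. The file and column names below are placeholders I made up for illustration, not the actual dataset:

```python
import pandas as pd

# Hypothetical file and column names: one row per bottle, ~200 taste-feature
# columns, plus a count of how many users voted on each bottle's taste.
wines = pd.read_csv("wines.csv")

MIN_VOTES = 30  # arbitrary threshold to mitigate noise
wines = wines[wines["n_taste_votes"] >= MIN_VOTES]

# Em's taste profile: one row per bottle he rated, with the same taste
# columns plus his rating of that bottle.
user = pd.read_csv("em_ratings.csv")
```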

Em likes a lot of different wines, whose taste features can differ considerably, so we’ll narrow the model down to his favorite type: Bordeaux Reds. The next step consists of running a linear regression (that is, “a machine learning algorithm” in marketing speak) on his user data, and then generalizing our model to the rest of the bottles in the wine data.

Let’s walk through the process:

First of all, there are way too many taste features assigned to the wines, so we’ll need to compress them without losing signal. These features are also highly collinear, which essentially means they’re correlated with each other, and that’s bad for the model. As a simple example, consider flavors such as “coffee”, “mocha” and “espresso”: it makes sense that votes from various users will be scattered across all three of these features even though they come from the same origin, a coffee-like taste. The sole thought of dealing with this issue manually is daunting. Luckily, there are better options; one of them is Principal Component Analysis (PCA).

What PCA does, in a very non-technical explanation, is “squash” the features in the dataset while preserving their descriptiveness. Exactly what we need! Thanks to PCA, we can go from over 200 taste features to just 20, so it is effectively discovering new, prototypical tastes along the way. The drawback is that these compressed features aren’t as interpretable as the original ones, but that isn’t an issue here, and you’ll see why in a moment.
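As a rough scikit-learn sketch of that compression step, using the placeholder column names from the earlier snippet (and note that, as explained next, in practice the PCA is fitted on the user and wine data together):

```python
from sklearn.decomposition import PCA

# All ~200 original taste columns (everything except ratings and vote counts)
taste_cols = [c for c in wines.columns if c not in ("rating", "n_taste_votes")]

pca = PCA(n_components=20)                        # squash ~200 features into 20
wine_taste_20d = pca.fit_transform(wines[taste_cols])

# Sanity check: how much of the original variance the 20 components retain
print(pca.explained_variance_ratio_.sum())
```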

We apply PCA to all the data together, that is, the user (or “train”) dataset and the wines (or “test”) dataset. Otherwise, the compressed features wouldn’t match between datasets. Then, we can run the linear regression, or “hypothesis”, on the user data, with the rating score as the dependent variable. What this does is fit a line between the user’s ratings and the compressed taste features. By doing so, it “learns” what numbers to assign to the coefficients that get multiplied by the values of the taste features to determine the ratings. If that wasn’t 100% clear, and I have a feeling it might not be, I’ll try to make it so with the good ol’ linear regression formula:
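Written out in standard form, with betas as the coefficients:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{20} x_{20}$$

where $\hat{y}$ is Em’s predicted rating for a bottle, $x_1, \dots, x_{20}$ are that bottle’s 20 compressed taste features, and the $\beta$s are the coefficients (one per feature, plus an intercept) that the regression learns.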

Hope this clarifies what was stated above. It doesn’t help that the coefficients are also called betas (the symbol used here) or thetas, depending on who you ask. Math is all about precision and consistency, right? Image by author

Anyhow, as you can see, the betas/thetas/coefficients we obtained can be multiplied by the values of the corresponding taste features of any wine, giving Em’s predicted rating for that bottle.

In other words, they represent Em’s unique combination of taste preferences for Bordeaux Reds. I did say it was going to be easy, didn’t I?
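Putting the pieces together, a minimal sketch of the fit-and-predict step might look like this, reusing the placeholder `wines`, `user` and `taste_cols` from the earlier snippets:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# Fit PCA on the user and wine taste features together, so the 20
# compressed features mean the same thing in both datasets.
all_taste = np.vstack([user[taste_cols].values, wines[taste_cols].values])
pca = PCA(n_components=20).fit(all_taste)

X_user = pca.transform(user[taste_cols].values)    # "train" features
X_wines = pca.transform(wines[taste_cols].values)  # "test" features

# Learn Em's betas: rating ≈ beta_0 + beta_1 * x_1 + ... + beta_20 * x_20
model = LinearRegression().fit(X_user, user["rating"].values)

# Multiply those betas by any bottle's compressed taste features
# to get Em's predicted rating for it.
wines["predicted_rating"] = model.predict(X_wines)
```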

Obviously, this model should only be applied to wines that are coherent with the training set. I did that for 307 bottles of the Bordeaux MĂ©doc variety (Em’s most-rated one). Below, you can scroll through the values of all the taste features of the wines Em would potentially like and dislike (i.e., those whose predicted rating is above or below certain thresholds). The names and prices of the wines have been purposely omitted.

We’re looking at the original data before PCA compression, so the taste features still have meaning. Indeed, the results seem to suggest that Em would prefer earthy and smoky wines with hints of blackcurrant and red fruit and the Petit Verdot grape, while tending to dislike those with a strong chocolate and leathery taste and hints of tomato.
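A comparison along those lines can be sketched by splitting the bottles on the predicted-rating thresholds and averaging the original (pre-PCA) taste columns in each group. The threshold values here are made up for illustration:

```python
# Split on arbitrary example thresholds for "liked" vs "disliked"
liked = wines[wines["predicted_rating"] >= 4.0]
disliked = wines[wines["predicted_rating"] <= 3.0]

# Difference in the mean of each original taste feature between the groups
gap = (liked[taste_cols].mean() - disliked[taste_cols].mean()).sort_values()
print(gap.tail(10))  # tastes the model thinks Em prefers
print(gap.head(10))  # tastes the model thinks Em dislikes
```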

Of course, the only way to actually test this would be to have Em taste these wines and let us know what he thinks. As I don’t know him personally, I’m keeping track of which wines I like and dislike so I can eventually experiment on myself. If you have this type of data and want to try, let me know!

Recently, one of the biggest wine-selling websites introduced such a taste-wine matching service for their users. I’m willing to bet it doesn’t work too differently from what I presented here.

P.S. Machine learning aficionados will notice I skipped the “cross-validation” step, where part of the wines in the user data would be held out from training and used instead to assess the validity of the learnt betas, by comparing the predicted ratings with the actual ones. Given the scope of this post, that step was glossed over, though in a more rigorous approach it would be fundamental: if the ratings were predicted poorly, some corrections would be warranted, such as considering another learning algorithm. The general principle stays the same though!
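For the curious, that held-out check could look roughly like this (again a sketch with the placeholder variables from above; a stricter version would also keep the PCA fit inside the training split):

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hold out a fifth of Em's rated bottles to validate the learnt betas
X_train, X_test, y_train, y_test = train_test_split(
    X_user, user["rating"].values, test_size=0.2, random_state=42
)

cv_model = LinearRegression().fit(X_train, y_train)
print(mean_absolute_error(y_test, cv_model.predict(X_test)))  # avg. rating error
```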

Now that we’ve dealt with that,

let us proceed to the second part of the post. It is linked to the first part in the sense that we’ll apply an evolved version of PCA to all the wines in the dataset. Think of it as PCA on steroids, which compresses the data while representing the various relationships between entities as distances on a 2- (or 3-) dimensional map. In this case, I’m using UMAP, and if you’re drawn to this sort of algorithm, I suggest you look into the work of Alex Telea, who is doing some rather interesting things on the topic.
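A minimal umap-learn sketch of that step, again with the placeholder taste columns from before:

```python
import umap  # pip install umap-learn

# Embed every bottle's taste profile into 2 dimensions so that
# similar-tasting wines end up close together on the map.
reducer = umap.UMAP(n_components=2, random_state=42)
embedding = reducer.fit_transform(wines[taste_cols].values)

wines["x"], wines["y"] = embedding[:, 0], embedding[:, 1]
```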

Anyway, through this algorithm I made a meaningful, explorable map of wines arranged according to their taste, as judged by thousands of users. Or, to be more specific, a map of 3834 wines from 110 different specialties, arranged according to their scores in 169 different tastes, as voted by approximately 492,516 unique people.

Not to beef with any sommelier and their rigorous study of the subject, but this half-a-million-strong human hive-mind is likely a more reliable estimator of the taste of wines than the brief descriptions generally found on bottles.

Let me present the interactive plot where you can explore this map. As far as I know, you’ll be among the first to have the occasion to do so. It isn’t the same as exploring a new continent, but it’s still quite fascinating, no?

With some guidance from a friend knowledgeable in JavaScript, I included buttons that highlight the wines linked to the corresponding taste(s). These tastes could be assigned at the bottle-by-bottle level or to entire varieties. I opted for the latter, as it gives more insight into why the algorithm chose to arrange the wines in the 2-dimensional space the way it did, even though the only information it used was the individual bottles’ taste scores.
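The version in the post is powered by that custom JavaScript, but a rough Python approximation of the same idea can be built with Plotly Express. The "region", "grape" and "specialty" columns below are placeholders:

```python
import plotly.express as px

fig = px.scatter(
    wines, x="x", y="y",
    color="region",                     # geographical origin, as in the legend
    hover_data=["grape", "specialty"],  # shown when hovering over a bottle
)
fig.update_traces(marker=dict(size=5))
fig.write_html("tastespace.html")       # a zoomable, explorable HTML map
```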

Unfortunately, I can’t embed the interactive plot directly here, so you’ll have to go to this link: https://dionysus-stempio.netlify.app/. In the meantime, here’s a GIF preview:

You can zoom in and out, and click and drag to explore the map. Each point corresponds to a bottle of wine and is colored based on its geographical origin (legend at the bottom). When you click on a taste button, the names of the wine specialties that have the selected taste(s) appear in the right corner, and their corresponding dots remain colored, while those that don’t become transparent. Hovering over individual wines shows their grapes and specialties. Gif by author

Once again, the names and prices of the wines were purposely omitted. I briefly toyed with the idea of turning the plot into a tool for finding inexpensive alternatives to luxury wines that are supposedly judged similarly in terms of taste, but in the end I decided not to. Humans are notoriously bad at separating the price paid for a wine from its objective taste (proof here). Besides, this is true for other luxury goods as well (did someone say Apple or Gucci?).

I think it would be a neat idea to have a more visually embellished version of this map, perhaps even in 3 dimensions, at a museum like Bordeaux’s CitĂ© du Vin for visitors to play around with. I proposed this idea to them, but never heard back ¯\_(ツ)_/¯

This concludes the series on wine and its data, unless I get some new inspiration! Although few things are as motivating as good wine, I’ll be looking out for new serendipities in the age of data.

Thanks for reading!

Bonus colorful figure, in case you were wondering which wine specialties are the most appreciated:

Only specialties with at least 40 bottles rated are included. Image by author
