A Bayesian Way of Choosing a Restaurant

Kirill Tsyganov
Towards Data Science
3 min read · Oct 12, 2023


Recently I was looking for a good new restaurant. Google Maps showed me two options: restaurant A with 10 reviews, all 5 stars, and restaurant B with 200 reviews and an average rating of 4. I was tempted to choose restaurant A, but the low number of reviews concerned me. On the other hand, the many reviews of restaurant B gave me confidence in its 4-star rating, but promised nothing excellent. So I wanted to compare the two restaurants and choose the better one while properly accounting for the number of reviews (or the lack of them). Thanks to Bayes, there is a way.

Image made by the author.

The Bayesian framework lets us assume something about the initial distribution of ratings and then update that initial belief based on the observed data.
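
In symbols, this is just Bayes' rule: the posterior is proportional to the likelihood times the prior, P(ratings | reviews) ∝ P(reviews | ratings) × P(ratings). Everything below is this rule applied to star ratings.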

Set initial beliefs / prior

  • Initially we know nothing about the probabilities of each rating (1 to 5 stars). So, before any reviews, all ratings are equally likely. This means we start from a uniform distribution, which can be expressed as a Dirichlet distribution (a generalization of the Beta) with all concentration parameters equal to 1.
  • The prior average rating is simply (1+2+3+4+5)/5 = 3, which is where most of the probability mass is concentrated.
import numpy as np

# prior: sample rating probabilities from a uniform Dirichlet (all concentration parameters = 1)
sample_size = 10000
p_a = np.random.dirichlet(np.ones(5), size=sample_size)
p_b = np.random.dirichlet(np.ones(5), size=sample_size)

# prior expected rating implied by each sampled probability vector
ratings_support = np.array([1, 2, 3, 4, 5])
prior_reviews_mean_a = np.dot(p_a, ratings_support)
prior_reviews_mean_b = np.dot(p_b, ratings_support)
Image made by the author.
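
As a quick sanity check (not in the original post), the average of these sampled prior means should land close to 3 for both restaurants; the exact values wobble slightly from run to run.

print(prior_reviews_mean_a.mean())  # ~3.0
print(prior_reviews_mean_b.mean())  # ~3.0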

Update beliefs

  • To update the initial beliefs, we multiply the prior by the likelihood of observing the data under those beliefs.
  • The observed review counts are naturally described by a Multinomial distribution (a generalization of the Binomial).
  • It turns out that the Dirichlet is a conjugate prior to the Multinomial likelihood. In other words, our posterior distribution is also a Dirichlet, with parameters that incorporate the observed data, as spelled out below.
Image made by the author.
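
Concretely, if the prior is Dirichlet(α₁, …, α₅) and we observe n₁, …, n₅ reviews with 1 to 5 stars, the posterior is Dirichlet(α₁ + n₁, …, α₅ + n₅). With the uniform prior every αₖ = 1, which is why the code below simply adds 1 to the observed counts.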
# observed data: counts of 1- to 5-star reviews
reviews_a = np.array([0, 0, 0, 0, 10])
reviews_b = np.array([21, 5, 10, 79, 85])

# posterior estimates of rating probabilities: Dirichlet(prior ones + observed counts)
sample_size = 10000
p_a = np.random.dirichlet(reviews_a + 1, size=sample_size)
p_b = np.random.dirichlet(reviews_b + 1, size=sample_size)

# calculate posterior ratings' means
posterior_reviews_mean_a = np.dot(p_a, ratings_support)
posterior_reviews_mean_b = np.dot(p_b, ratings_support)
  • The posterior average rating of A is now somewhere between the prior of 3 and the observed 5. The average rating of B, however, barely changed, because the large number of reviews outweighed the initial beliefs.
Image made by the author.
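
A quick numeric check of this shrinkage (not in the original post); the exact figures vary slightly with the random draw.

print(posterior_reviews_mean_a.mean())  # ~4.3, pulled from the observed 5 toward the prior 3
print(posterior_reviews_mean_b.mean())  # ~4.0, essentially the observed average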

So, which one is better?

  • Back to the original question: “better” means the probability that the average rating of A is higher than the average rating of B, i.e., P(E(A|data) > E(B|data)).
  • In my case, I obtain a probability of about 85% that restaurant A is better than restaurant B.
# P(E(A|data) - E(B|data) > 0), estimated as the share of samples where the difference is positive
posterior_rating_diff = posterior_reviews_mean_a - posterior_reviews_mean_b
p_posterior_better = np.mean(posterior_rating_diff > 0)
Image made by the author.
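
Beyond a single probability, the same samples also show how large the difference might be. For example (a small extension, not in the original post), a 90% credible interval for E(A|data) − E(B|data):

print(np.percentile(posterior_rating_diff, [5, 95]))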

Bayesian updating allows us to incorporate prior beliefs, which is especially valuable when the number of reviews is small. When the number of reviews is large, however, the initial beliefs have little impact on the posterior.

The code is available on my GitHub, and I am going to restaurant A.
