Bayesian Ranking System

Ranking with varying numbers of responses

R Andrew Cocks
Towards Data Science

--

Note: Assumes familiarity with the beta distribution covered earlier.

Beyond calculating lottery probabilities or disease likelihoods, Bayes' theorem has other applications; for example, we can build a ranking system. Consider a movie ranking website where users vote up or down on movies. Simple ranking schemes, such as the percentage of positive votes or up votes minus down votes, perform poorly:

Percentage: 60 up : 40 down and 6 up : 4 down both score 60%

Up minus down: 100 up : 95 down and 5 up : 0 down both score +5

What we would like is for more votes to add more information; 60 votes hold more weight than 6 votes. Let's use the votes as likelihoods in Bayesian inference. Here's a set of movies A-E with up/down votes and the calculated beta distributions, starting from a uniform Beta(1,1) prior:

However, the beta distribution is a PDF, so we need some way to reduce it to a scalar for ranking. One approach is to find a lower bound on the true rating such that we are 95% confident the true value is greater. This can be approximated by subtracting a number of standard deviations from the mean:

Ranking = Mean + z-score × standard deviation

For a normal approximation, the cumulative distribution function (CDF) is 5% at a z-score of -1.64, which you can use in the formula above. However, if you have access to the inverse CDF (aka percent point function) of the beta distribution itself, you can use it directly:

rank = beta.ppf(0.05,a,b) # Python
rank = BETA.INV(0.05,a,b) # Excel, Sheets
rank = qbeta(0.05,a,b)    # R
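As a concrete check, here is a short sketch (assuming SciPy is available) that computes the rank for movie B (6 up, 1 down) under the uniform Beta(1,1) prior, using both the exact beta inverse CDF and the normal approximation:

```python
from scipy.stats import beta

up, down = 6, 1          # movie B's votes
a, b = 1 + up, 1 + down  # posterior under a uniform Beta(1, 1) prior

# Exact: 5th percentile of the Beta(7, 2) posterior
rank = beta.ppf(0.05, a, b)

# Normal approximation: mean minus 1.64 standard deviations
approx = beta.mean(a, b) - 1.64 * beta.std(a, b)

print(round(rank, 2))  # rank of about 0.53
print(round(approx, 2))
```

With only 7 votes the posterior is noticeably skewed, so the normal approximation is looser than the exact percentile here; with more votes the two converge.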

Yielding the following results:

B   6:1  rank: 0.53
A  60:40 rank: 0.52
C   6:4  rank: 0.35
E  10:20 rank: 0.21
D   1:2  rank: 0.10
Beta distributions of five ranked movies A, B, C, D, E

This separates the high-evidence A (60:40) and low-evidence C (6:4) movies, but it isn't an ideal ranking overall, particularly for D (red), which has very little evidence yet receives the worst ranking. We know that average movies are more common than extremely good or extremely bad ones. We'd prefer a ranking that starts from an assumption of an average movie and requires evidence to move towards the extremes. We can build this preference into the ranking system by using a prior biased towards average movies: Beta(11,11) rather than Beta(1,1). It now takes a number of votes to move away from the prior towards the extremes, yielding a better overall ranking:

A  60:40 rank: 0.51
B   6:1  rank: 0.43
C   6:4  rank: 0.39
D   1:2  rank: 0.32
E  10:20 rank: 0.30
Ranking with a prior biased towards an assumption of average movies

In this final result the low-evidence D curve sits roughly in the middle, while E, with significantly more negative evidence, now has the lowest rank.

This doesn't only work with an up/down rating; you can extend it to a star-based system by converting each star rating into simultaneous fractional up and down votes:

Stars (1,2,3,4,5): up (0,0.25,0.5,0.75,1) down (1,0.75,0.5,0.25,0)

so if three people each rate a movie 4 stars, the total score is:

3  up  votes each of 0.75 value = 3 × 0.75 = 2.25
3 down votes each of 0.25 value = 3 × 0.25 = 0.75
Beta(3.25, 1.75)   # uniform Beta(1,1) prior
Beta(13.25, 11.75) # prior biased towards an average movie
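This conversion can be wrapped in a small helper; the function name and the mapping table below are illustrative, not from the original article:

```python
from scipy.stats import beta

# Fractional up-vote value for each star rating; the down value is 1 minus this
STAR_UP = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}

def star_rank(stars, prior_a=11, prior_b=11, confidence=0.05):
    """Rank a list of star ratings via fractional up/down votes.

    Each rating s contributes STAR_UP[s] up-votes and 1 - STAR_UP[s]
    down-votes; the prior defaults to the average-movie Beta(11, 11).
    """
    up = sum(STAR_UP[s] for s in stars)
    down = sum(1 - STAR_UP[s] for s in stars)
    return beta.ppf(confidence, prior_a + up, prior_b + down)

# Three 4-star ratings: up = 2.25, down = 0.75 -> Beta(13.25, 11.75)
print(round(star_rank([4, 4, 4]), 2))
```

Half-star systems extend the same way: a 3.5-star vote would simply contribute 0.625 up and 0.375 down.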

We've built a stable ranking system that rewards increasing amounts of evidence, and shown how it can be extended to star rating systems, all thanks to the Reverend Bayes.

If you want to try it for yourself the python code below does the ranking and draws the graphs for the biased case.
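That code isn't reproduced here, but a minimal sketch of such a script (assuming SciPy and Matplotlib, with illustrative variable names) might look like:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; remove to display interactively
import matplotlib.pyplot as plt
from scipy.stats import beta

# Movie: (up votes, down votes)
votes = {"A": (60, 40), "B": (6, 1), "C": (6, 4), "D": (1, 2), "E": (10, 20)}
PRIOR_A, PRIOR_B = 11, 11  # prior biased towards average movies

# Rank each movie by the 5th percentile of its Beta posterior
ranks = {m: beta.ppf(0.05, PRIOR_A + up, PRIOR_B + down)
         for m, (up, down) in votes.items()}

# Plot the posterior beta distributions
x = np.linspace(0, 1, 500)
for m, (up, down) in votes.items():
    plt.plot(x, beta.pdf(x, PRIOR_A + up, PRIOR_B + down), label=m)
plt.xlabel("True positive-vote fraction")
plt.ylabel("Density")
plt.legend()
plt.savefig("movie_ranking.png")

# Print the movies from best to worst rank
for m, r in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{m} {votes[m][0]}:{votes[m][1]} rank: {r:.2f}")
```

Swapping `PRIOR_A, PRIOR_B` to `1, 1` reproduces the uniform-prior ranking from earlier in the article.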
