
Magic 8 Ball: An App to Maximize Wins in Competitive Pool Matches (Part 3)

How machine learning and SQL can help you outsmart your opponent in sequential team games

Welcome back to the final part of this three-part series on data-driven pool strategy!

In Part 1, we explored the quirks of competitive team pool, and determined that player lineup selections for pool games could be improved with data science. We developed a classification model that predicts the probability that one player will beat another player based on past results.

In Part 2, we went one step further and implemented a maximin player selection algorithm in SQL which uses the model predictions to calculate optimal player selection choices.

In this final part, we will compare our machine learning-powered player selections to real-world player selection strategies and determine whether we really can gain a consistent competitive edge over our opponents.

Part 1: Introduction and predictive modelling.

Part 2: Strategising with SQL.

Part 3: ‘Potting’ it to the test!


Comparison to real strategies

You might be wondering if team captains actually put much thought into their player lineups, or if they are comfortable with letting the players choose their playing order amongst themselves. The answer is that some don't, believing it futile to fight against the league's equalizing handicap. However, we saw in Part 1 that despite the handicap, we can still use the Skill Margin, Race Margin, and Win % Margin associated with a player pairing to accurately predict the probability that one player will win the round.

A commonly used real-life strategy is to match the skill levels of your players with those of similarly skilled players on the opposing team. The logic behind this method is that every player has confidence that they can beat their equally skilled counterpart.

So, let's compare this real-life skill-matching strategy with the maximin selection algorithm developed in Part 2, i.e. our app's recommendations. We can compare the performance of the two strategies over 1000 simulations to see which comes out on top.

For each simulation, two teams of five players with randomly distributed parameters are generated using the numpy.random.rand() function.
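As a rough sketch, the team generation could look something like the snippet below; the attribute names (skill, win_pct) and their scaling are illustrative assumptions rather than the exact parameters used in the simulations.

```python
import numpy as np

def generate_team(n_players=5):
    """Generate one simulated team of players with uniformly random attributes.
    The attribute names and scaling here are illustrative assumptions."""
    return {
        "skill": np.random.rand(n_players) * 100,  # skill rating
        "win_pct": np.random.rand(n_players),      # historical win fraction
    }

team_a = generate_team()
team_b = generate_team()
```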

Player selections for Team A are performed in three ways:

  • Strategy 1: Selecting the player with the closest skill match to the selected team B player.
  • Strategy 2: Using the maximin app recommendations.
  • Baseline: making random choices.

Player selections for Team B are always random; a sketch of how one round of these picks might be coded is shown below.
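This is a minimal sketch of strategy 1 (skill matching) and the random baseline; the maximin recommendations of strategy 2 come from the SQL logic of Part 2, so they are not reproduced here, and the function names and data layout are assumptions.

```python
import numpy as np

def pick_closest_skill(team_a_skills, available_a, opponent_skill):
    """Strategy 1: from the remaining Team A players, pick the one whose
    skill rating is closest to the Team B player already nominated."""
    available = sorted(available_a)
    diffs = [abs(team_a_skills[i] - opponent_skill) for i in available]
    return available[int(np.argmin(diffs))]

def pick_random(available):
    """Baseline: pick any remaining player at random."""
    return int(np.random.choice(sorted(available)))

# One round: Team B nominates a player at random, Team A responds.
team_a_skills = np.random.rand(5) * 100
team_b_skills = np.random.rand(5) * 100
available_a, available_b = set(range(5)), set(range(5))

b_pick = pick_random(available_b)
a_pick = pick_closest_skill(team_a_skills, available_a, team_b_skills[b_pick])
available_a.remove(a_pick)
available_b.remove(b_pick)
```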

This results in three different final lineups for each simulated match. The associated match winning probability is recorded for each of these lineups, along with the projected score tally calculated according to NAPA rules.

We can then quantify the effectiveness of each strategy using the probability difference, defined as:

P_diff = P(win | strategy 1 or 2) − P(win | random choices)

i.e. the match winning probability of the final lineup when following strategy 1 or 2, minus the match winning probability of the final lineup when making random choices. In Figure 1(a), the probability difference is plotted for the 1000 simulated matches. Red shading corresponds to P_diff when following strategy 1 (the similar skills strategy), while blue shading corresponds to P_diff for strategy 2 (the app strategy).

Figure 1(a): Pdiff when using app recommendations (blue) and when using similar skills strategy (red). (b). Predicted score difference between the average and using alternative strategies. Image by author.

It can be seen that the red distribution is centred on a mean P_diff of 0, meaning that averaged over many matches, the similar skills strategy does not result in an improvement over making random choices. This makes sense, given that in Part 1 of this series we showed that the skill level is only one factor in determining a player’s winning probability.

On the other hand, the mean value of P_diff when using the app recommendations is 0.054, and P_diff is greater than 0 in 98.4% of the simulated matches!
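As a minimal sketch, with placeholder arrays standing in for the simulation outputs, these summary statistics could be computed as follows:

```python
import numpy as np

# Placeholder values: in practice these arrays hold the final-lineup match
# winning probability for each of the 1000 simulated matches.
p_strategy = np.random.rand(1000)   # strategy 1 or 2
p_random = np.random.rand(1000)     # random choices on the same matches

p_diff = p_strategy - p_random
print(f"mean P_diff = {p_diff.mean():.3f}")
print(f"P_diff > 0 in {100 * (p_diff > 0).mean():.1f}% of matches")
```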

The predicted score difference in Figure 1(b) follows a normal distribution, so we can calculate the one-sided p-value for a t-test between the means of the predicted score differences for the app recommendations method and the similar skills method. The calculated value is well below the chosen significance level of 𝛼=0.05, so we can conclude that app recommendations yield better predicted scores than using the similar skills strategy.
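For reference, a one-sided two-sample t-test of this kind can be run with scipy.stats; the arrays below are placeholders for the per-match predicted score differences, and exactly which t-test variant was used in the original analysis is an assumption here.

```python
import numpy as np
from scipy import stats

# Placeholder data: per-match predicted score difference under each strategy.
score_diff_app = np.random.normal(loc=0.5, scale=1.0, size=1000)
score_diff_skills = np.random.normal(loc=0.0, scale=1.0, size=1000)

# One-sided test: is the mean score difference larger with app recommendations?
t_stat, p_value = stats.ttest_ind(score_diff_app, score_diff_skills,
                                  alternative="greater")
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.4g}")
```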

Could we do even better?

We have shown that using the app recommendations leads to match winning probabilities that are on average 0.054 higher than making random choices or using the similar skills strategy. Is this the best we can do? Are we always getting the best possible lineups or could we boost our chances even more?

Figure 2: Final lineups ranked against all permutations when making random choices (red) and using app recommendations (blue). Image by author.

It is insightful to think in terms of the permutation ranking, where the top ranking permutation (rank 1) is the lineup which has the highest match winning probability for Team A, and the lowest ranking permutation (rank 120) has the lowest match winning probability for Team A.
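A sketch of this ranking, assuming a helper match_win_prob() built on the Part 1 model and NAPA scoring, could look like this:

```python
from itertools import permutations

def rank_lineups(team_a_ids, team_b_order, match_win_prob):
    """Rank all 120 ways of ordering Team A's five players against a fixed
    Team B order, from highest to lowest match winning probability for Team A.
    `match_win_prob(pairs)` is an assumed helper that combines the Part 1
    model's per-round probabilities into a match winning probability."""
    ranked = sorted(
        ((match_win_prob(list(zip(perm, team_b_order))), perm)
         for perm in permutations(team_a_ids)),   # 5! = 120 permutations
        reverse=True,
    )
    return ranked  # ranked[0] is the rank-1 lineup, ranked[-1] is rank 120
```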

Figure 2 shows a histogram of the permutation rank for each lineup obtained from the 1000 simulations, with app-recommended lineups in blue and random choice lineups in red. It is clear that when using the app recommendations we are very likely to end up with one of the best possible lineups, and the chances of getting a worse lineup decrease exponentially. Only on extremely rare occasions does Team B get lucky with their random selections and leave us with a lineup worse than the average rank of 60. On the other hand, the uniform nature of the red distribution shows that when we also make random choices we are equally likely to end up with any ranking.


What if the opponent is a tactical genius?

So we've confirmed that our app strategy outperforms a naïve random opponent, but how will it fare against a captain with a bit more tactical acumen? Let's consider the worst case scenario – giving the opponent access to the app recommendations as well. To test out this scenario, 1000 new matches were simulated with player selections determined by the maximin strategy for both teams. As before, the same matches were repeated using random player selections for both teams so that P_diff could be calculated for each match. The simulation was run twice, once with Team A making the first pick, and once with Team B making the first pick. The P_diff distribution is shown in Figure 3, where the results when Team A picks first and second are plotted in light blue and dark blue, respectively.

Figure 3: Probability difference from 1000 simulated matches when both team captains have access to the app recommendations. Light blue indicates Team A picking first, dark blue for Team A picking second. Image by author.

It is evident from the distributions that there is a slight advantage to whichever captain selects first in round one! [Note that this advantage does not appear when one of the team captains makes random choices, in that case both distributions resemble the red distribution of Figure 1(a)].

When the Team A captain picked first, P_diff was greater than 0 in 71.8% of the simulated matches, with a mean P_diff of 0.012.

When they picked second, P_diff was greater than 0 in only 28.6% of the simulated matches, with a mean P_diff of -0.012.

This result is statistically significant, with p<0.0001 for a Wilcoxon signed-rank test.
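For reference, a Wilcoxon signed-rank test of this kind can be run with scipy.stats; the arrays below are placeholders, and treating the first-pick and second-pick runs as paired samples is an assumption.

```python
import numpy as np
from scipy import stats

# Placeholder data: P_diff for the simulated matches when Team A picks
# first versus second.
p_diff_first = np.random.normal(loc=0.012, scale=0.05, size=1000)
p_diff_second = np.random.normal(loc=-0.012, scale=0.05, size=1000)

stat, p_value = stats.wilcoxon(p_diff_first, p_diff_second)
print(f"Wilcoxon statistic = {stat:.0f}, p = {p_value:.2g}")
```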

We can therefore draw the surprising conclusion that, all strategies being equal, over many matches we can expect a small increase in match winning probability simply by electing to select a player first after winning the coin toss.

It is worth noting that from personal experience, this is counter to common belief among pool captains, who are often unwilling to make the first move!


Front-end and app deployment

The final step in the app development was to design a front-end interface for team captains to use. There are many fantastic guides for setting up web applications using Flask in communication with a PostgreSQL database, so I won't delve into the details here. A useful step-by-step guide to deploying a web app on Heroku can be found here.
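For orientation only, a bare-bones Flask route backed by a PostgreSQL connection might look like the sketch below; the table and column names are placeholders, not the app's actual schema.

```python
import os

import psycopg2
from flask import Flask

app = Flask(__name__)

def get_conn():
    # On Heroku, the attached PostgreSQL database is exposed via DATABASE_URL.
    return psycopg2.connect(os.environ["DATABASE_URL"])

@app.route("/")
def players():
    with get_conn() as conn, conn.cursor() as cur:
        cur.execute("SELECT player_name, skill FROM players ORDER BY player_name;")
        rows = cur.fetchall()
    return "<br>".join(f"{name}: {skill}" for name, skill in rows)

if __name__ == "__main__":
    app.run(debug=True)
```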

Feel free to play around with the final product at http://magic8billiards.herokuapp.com/. Instructions for use can be found under the ‘How to use’ tab on the website.

Figure 4: The app visual assistant. Each dot represents the match winning probability of a single lineup permutation, with dark red dots indicating lineups which are still possible. As more rounds are completed, most lineups become impossible and are greyed out. The final dark red dot is the final lineup. Image by author.

A key element of the app is the visual assistant: a dynamic swarmplot which plots the match winning probabilities for all 120 possible lineups as red dots. As rounds are completed, some lineups become impossible and these dots are greyed out. As we have shown above, the app guides team captains towards lineups with the highest match winning probability. By round 5 there is only one red dot remaining, representing the final selected lineup. Unless the captain is especially unlucky, this dot should be well towards the right side of the swarm, indicating a boosted chance of victory!
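A rough sketch of how such a swarmplot could be drawn with seaborn is shown below; the placeholder data and the possible/eliminated labels are assumptions standing in for the app's real lineup probabilities.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Placeholder data: a probability for each of the 120 permutations and a
# flag for whether that lineup is still achievable after the completed rounds.
df = pd.DataFrame({
    "match_win_prob": np.random.rand(120),
    "status": np.where(np.random.rand(120) < 0.2, "possible", "eliminated"),
})

sns.swarmplot(data=df, x="match_win_prob", hue="status",
              palette={"possible": "darkred", "eliminated": "lightgrey"})
plt.xlabel("Match winning probability for Team A")
plt.show()
```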

Wrapping up

Congratulations on reaching the end of this three-part deep dive into the idiosyncrasies of team pool strategy!

After developing a probabilistic model in Part 1, in Part 2 we designed SQL queries that identify optimal player selections in each round of a match. Finally, in Part 3 we evaluated the performance of these queries over many simulated matches.

The key takeaways are:

  • Logistic regression can reliably predict the probability that one player will triumph over another in the NAPA pool league, using skill level, previous win percentage, and race length as input features.
  • A maximin player selection strategy results in a mean improvement in match winning probability of 5.4% over random choices.
  • The team captain who wins the coin toss should elect to pick first in round 1 – this boosts match winning probability by 1.2% even in the worst case scenario.

The selection algorithms discussed here have applications that stretch beyond the realms of amateur pool. Some industry scenarios where a similar type of problem arises include:

  • Two companies competing to win the most contracts from a list of available projects.
  • Two airlines bidding to win landing slots at an airport.

I hope that this series has inspired you to pick up a pool cue and check out your local league. Playing on a pool team is a great way to meet new friends, and the handicap system makes it an enjoyable experience for players of all abilities.

Image via instagram.com/stevenfritters under license to Callum O’Donnell.
