Old Principles, New Approaches: Bayes in Practice
In a discipline as innovation-focused as data science, approaches that were cutting-edge just a couple of years ago can feel stale today. That makes it all the more remarkable that Bayesian statistics, a set of principles almost three centuries old, has enjoyed such a long shelf life.
Bayes’ Theorem and its derivative applications aren’t something you learn about in a college stats course, only to be promptly filed away in the far periphery of your memory. Every day, data science and machine learning practitioners put these concepts to good use—and find new ways to leverage them in their projects.
This week, we look at several contemporary use cases that showcase the staying power of Bayesian methods. Let’s dive in.
- A/B testing with a Bayesian twist. Hannah Roos’s excellent deep dive provides a clear explanation of the differences between Bayesian and frequentist statistics, and shows how to conduct A/B tests with each approach. It then benchmarks their respective performance on a real-world example: measuring engagement on social media content.
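To give a flavor of the Bayesian side of that comparison, here is a minimal sketch of a Bayesian A/B test using a Beta-Binomial model. The engagement counts below are made up for illustration and are not from Roos's article, which you should read for the full treatment:

```python
import random

random.seed(0)

# Hypothetical engagement counts (illustrative only): successes / trials
clicks_a, views_a = 120, 1000
clicks_b, views_b = 145, 1000

# A Beta(1, 1) prior updated with binomial data yields a Beta posterior,
# which we can sample from directly with the standard library.
def posterior_samples(clicks, views, n=100_000):
    return [random.betavariate(1 + clicks, 1 + views - clicks) for _ in range(n)]

samples_a = posterior_samples(clicks_a, views_a)
samples_b = posterior_samples(clicks_b, views_b)

# Probability that variant B's true engagement rate exceeds A's
p_b_beats_a = sum(b > a for a, b in zip(samples_a, samples_b)) / len(samples_a)
```

Unlike a frequentist p-value, `p_b_beats_a` is a direct answer to the question stakeholders usually ask: how likely is it that B is actually better?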
- How to make your model work better with Bayesian optimization. Hyperparameter tuning is a key step in training a machine learning algorithm and minimizing its loss function. Carmen Adriana Martinez Barbosa unpacks how Bayesian optimization improves on previous methods, and walks us through its implementation in Python with the Mango package.
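Martinez Barbosa's tutorial uses the Mango package; as a library-free illustration of the underlying idea, here is a toy Bayesian-optimization loop with a hand-rolled Gaussian-process surrogate and a lower-confidence-bound acquisition function. The one-dimensional "loss" and all constants are assumptions for demonstration, not Mango's API or the article's code:

```python
import math
import random

random.seed(1)

# Toy objective to minimize, standing in for a model's validation loss
def loss(x):
    return (x - 2.0) ** 2 + 0.5

def rbf(a, b, length=1.0):
    # Squared-exponential kernel between two scalar inputs
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(A, b):
    # Gaussian elimination with partial pivoting (small systems only)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x, noise=1e-4):
    # Gaussian-process posterior mean and std at x given observations
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    alpha = solve(K, ys)
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve(K, k)
    var = max(rbf(x, x) - sum(ki * vi for ki, vi in zip(k, v)), 1e-12)
    return mean, math.sqrt(var)

# Bayesian-optimization loop: pick the candidate minimizing a lower
# confidence bound (mean - 2 * std), then evaluate the true objective.
xs = [0.0, 4.0]
ys = [loss(x) for x in xs]
for _ in range(10):
    candidates = [random.uniform(0.0, 4.0) for _ in range(50)]
    def lcb(x):
        mean, std = gp_posterior(xs, ys, x)
        return mean - 2.0 * std
    nxt = min(candidates, key=lcb)
    xs.append(nxt)
    ys.append(loss(nxt))

best_x = xs[ys.index(min(ys))]
```

The key contrast with grid or random search is that each new evaluation is chosen using the surrogate's uncertainty, which is why Bayesian optimization typically needs far fewer (expensive) objective evaluations.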
- Give your classification tasks a Bayesian boost. In his new explainer, Michał Oleszak covers the basics of naive Bayes classifier algorithms (if you’re new to this topic, this is a great place to start!). He goes on to suggest that, in some contexts, removing the algorithm’s naive independence assumption can improve your model’s accuracy.
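For readers who want to see the "naive" part in action, here is a minimal multinomial naive Bayes classifier on a tiny made-up spam dataset (all words and labels are illustrative assumptions, not Oleszak's examples):

```python
import math
from collections import Counter, defaultdict

# Tiny toy dataset (made up for illustration): token lists -> label
train = [
    (["win", "cash", "now"], "spam"),
    (["cash", "prize"], "spam"),
    (["meeting", "tomorrow"], "ham"),
    (["lunch", "tomorrow", "now"], "ham"),
]

# Fit: class priors and per-class word counts, with Laplace smoothing at
# prediction time so unseen words don't zero out a class.
labels = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
vocab = set()
for words, label in train:
    word_counts[label].update(words)
    vocab.update(words)

def predict(words):
    scores = {}
    for label, n in labels.items():
        total = sum(word_counts[label].values())
        # log P(label) + sum of log P(word | label), naively assuming
        # the words are conditionally independent given the label
        score = math.log(n / len(train))
        for w in words:
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)
```

The per-word sum of log-probabilities is exactly the independence assumption the article discusses relaxing: in reality, words like "cash" and "prize" co-occur far more often than independence predicts.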
- A fresh look at ranking problems. Part stats walkthrough, part hands-on tutorial, Dr. Robert Kübler’s article demonstrates how to build a model that lets you rank a set of players (all the Python code you’ll need is included), and also clarifies why the integration of prior beliefs—a core aspect of Bayesian techniques—leads to more robust rankings.
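A quick sketch of why priors make rankings more robust, using a shrinkage estimate of each player's win rate. The records and the Beta(5, 5) prior below are hypothetical choices for illustration, not Dr. Kübler's model:

```python
# Hypothetical win/loss records (illustrative only)
records = {
    "alice": (9, 1),    # 9 wins, 1 loss
    "bob": (60, 40),    # large sample, modest win rate
    "carol": (1, 0),    # a single game: raw win rate of 100%
}

# A Beta(5, 5) prior encodes the belief that most players are near 50%;
# the posterior mean shrinks small samples toward that prior.
PRIOR_WINS = PRIOR_LOSSES = 5

def posterior_mean(wins, losses):
    return (wins + PRIOR_WINS) / (wins + losses + PRIOR_WINS + PRIOR_LOSSES)

ranking = sorted(records, key=lambda p: posterior_mean(*records[p]), reverse=True)
```

Ranking by raw win rate would put carol first on the strength of one game; the posterior mean instead keeps her near the prior until more evidence arrives, which is the robustness the article highlights.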
While many of us can geek out over Bayes for days, you might also be up for some great reading on other topics. Here are a few of our recent favorites:
- Can you use one machine learning model to augment another? Ria Cheruvu makes the case for composite AI systems.
- Erin Wilson’s new post makes a complex workflow accessible for beginners: learn how you can model DNA sequences with PyTorch.
- Derrick Mwiti offers a thorough introduction to the TensorFlow 2 Object Detection API for anyone who’d like to use it for image segmentation (and, of course, for object detection).
- New online book alert: we’re thrilled to share the first chapter from Mathias Grønne’s extensive introduction to autoencoders.
- No-installation interactive Python apps?! Yes, you too can build them by following Sam Minot’s TDS debut, a useful, Streamlit-based tutorial.
- Learn how to serve ML models with Apache Spark — Pınar Ersoy shares a patient, end-to-end guide.
- Don’t miss Lynn Kwong’s latest contribution, which focuses on different ways to insert large numbers of records into your database efficiently.
We love sharing great data science work with you, and your support — including your Medium membership — makes it possible. Thank you!
Until the next Variable,
TDS Editors