Photo by Jens Lelie on Unsplash

Bayesian Statistics 101

Love it or hate it, you will never look at statistics the same way again

Arthur Mello
Towards Data Science
13 min read · Oct 18, 2021

Introduction

Bayesian statistics differs from classical (also known as frequentist) statistics mainly in how it interprets probability. The Bayesian view treats probability as a “degree of belief”, whereas the frequentist view treats it as the “relative frequency observed during many trials”.

This difference may seem abstract, but it has a large practical impact on the methods built on each interpretation.

The frequentist view is the more common one and has therefore shaped a greater number of statistical techniques. However, many modern methods rely on the Bayesian approach and can work well, giving you alternatives when there isn’t much data to work with.

The methods we’ll see now give you a formal framework for adding subjective judgment to your data science problems, which can be especially helpful when you don’t have much data available, or when you know the data is flawed in some way. They will also help you understand the reasoning behind some famous machine learning algorithms, such as the Naive Bayes classifier and Bayesian neural networks.

We’ll start with a quick look at Bayes’ theorem, the core of Bayesian statistics, and then move on to some of the techniques that stem from it and how they can be used to solve all sorts of statistical problems. We’ll be using Python, by the way.

Bayes’ Theorem

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

The equation above is quite simple, but understanding it requires some notation from probability theory:

  • P(A): probability of an event A
  • P(A|B): probability of an event A given that an event B occurred
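To make the notation concrete, here is a minimal sketch that estimates P(A), P(B) and P(A|B) from a small table of counts. The counts and variable names are made up purely for illustration; they are not data from this article.

```python
# Hypothetical counts for 1,000 people (invented for illustration only).
# A = "has the condition", B = "tested positive".
counts = {
    ("A", "B"): 45,          # has the condition, tested positive
    ("A", "not B"): 5,       # has the condition, tested negative
    ("not A", "B"): 95,      # no condition, tested positive
    ("not A", "not B"): 855, # no condition, tested negative
}
total = sum(counts.values())

# P(A): share of all people who have the condition
p_a = (counts[("A", "B")] + counts[("A", "not B")]) / total

# P(B): share of all people who tested positive
p_b = (counts[("A", "B")] + counts[("not A", "B")]) / total

# P(A|B): look only at people who tested positive (B),
# then take the share of them who have the condition (A)
p_a_given_b = counts[("A", "B")] / (counts[("A", "B")] + counts[("not A", "B")])

print(f"P(A)   = {p_a:.3f}")
print(f"P(B)   = {p_b:.3f}")
print(f"P(A|B) = {p_a_given_b:.3f}")
```

The point to notice is that P(A|B) shrinks the population to the cases where B happened before taking the share, which is why it can be very different from P(A).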

An event can be pretty much anything. For instance, P(A|B) can mean “the probability (P) of having COVID (A) given that (|) your PCR test came back positive (B)”. To calculate that probability with Bayes’ theorem, we would need:

  • P(A): probability of having COVID (regardless of test results)
  • P(B): probability of having a positive result in a PCR test (regardless of whether you have COVID or not)
  • P(B|A): probability of having a positive result in a PCR test given that you have COVID
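As a rough sketch of how these quantities combine, the snippet below applies Bayes’ theorem, P(A|B) = P(B|A) · P(A) / P(B), to the COVID example. Every number here is an assumption chosen for illustration, not a real prevalence or test accuracy figure, and the function name `bayes_posterior` is hypothetical. Since P(B) is rarely known directly, the sketch derives it with the law of total probability from an assumed false positive rate, a step that is not part of the text above.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence


# Assumed values, for illustration only
p_covid = 0.05              # P(A): share of people who have COVID
p_pos_given_covid = 0.90    # P(B|A): positive test given COVID
p_pos_given_healthy = 0.10  # positive test given no COVID (false positives)

# P(B): overall probability of a positive test, via the law of
# total probability over "has COVID" and "does not have COVID"
p_pos = (p_pos_given_covid * p_covid
         + p_pos_given_healthy * (1 - p_covid))

# P(A|B): probability of having COVID given a positive test
p_covid_given_pos = bayes_posterior(p_covid, p_pos_given_covid, p_pos)

print(f"P(positive)         = {p_pos:.3f}")
print(f"P(COVID | positive) = {p_covid_given_pos:.3f}")
```

With these made-up inputs, a positive result raises the probability of having COVID from 5% to roughly 32%. The low prior keeps the posterior well below the 90% that the test’s hit rate alone might suggest.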

Data scientist and educator. I write about data analysis and machine learning applied to marketing. New time series course: https://shorturl.at/fivw5