The Naïve Bayes Classifier
Joseph Catanzarite
The Naïve Bayes Classifier is perhaps the simplest machine learning classifier to build, train, and predict with. This post will show how and why it works. Part 1 reveals that the much-celebrated Bayes’ Rule is just a simple statement about joint and conditional probabilities. But its blandness belies astonishing power, as we’ll see in Parts 2 and 3, where we assemble the machinery of the Naïve Bayes Classifier. Part 4 is a brief discussion, and Parts 5 and 6 list the advantages and disadvantages of the Naïve Bayes Classifier. Part 7 is a summary, and Part 8 lists a few references that I’ve found useful. Constructive comments, criticism, and suggestions are welcome!
1. Prelude: Bayes’ Rule
Given two events A and B, the joint probability P(A,B) is the probability of A and B occurring together. It can be written in either of two ways:
The first way:
P(A,B) = P(A|B) * P(B)
Here P(A|B) is a conditional probability: the probability that A occurs, given that B has occurred. This says that the probability of A and B occurring together is (the probability that A occurs given that B has occurred) times (the probability that B has occurred).
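To make this concrete, here is a quick numeric check in Python. The card-drawing example and the variable names are my own illustration, not part of the original argument:

```python
# Numeric check of P(A,B) = P(A|B) * P(B) on a toy example:
# draw one card from a standard 52-card deck.
# Let A = "the card is a king", B = "the card is a face card (J, Q, K)".

p_B = 12 / 52          # 12 face cards in a 52-card deck
p_A_given_B = 4 / 12   # of the 12 face cards, 4 are kings
p_A_and_B = 4 / 52     # 4 cards are both a king and a face card

# The two ways of computing the joint probability agree.
assert abs(p_A_and_B - p_A_given_B * p_B) < 1e-12
print(p_A_and_B, p_A_given_B * p_B)  # both ≈ 0.0769
```

The identity holds by construction: conditioning on B restricts attention to the face cards, and multiplying by P(B) rescales back to the full deck.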
The second way: