Probability theory isn’t hard, at least not at the level needed to get started in data science. It might have been a while since your last exposure to the topic, and if you feel rusty, this article might be just the way to get back on track.

If you are just starting out in data science, a quick Google search will turn up four main mathematical topics the whole field is built on:
- Linear Algebra
- Calculus
- Statistics
- Probability
Recently I’ve covered Linear Algebra and Calculus, so feel free to read those articles too if you’re rusty on those topics. Today, however, I want to cover two crucial concepts from probability theory: combinations and permutations.
Let’s begin with a basic definition of probability itself:
Probability is a measure quantifying the likelihood that events will occur. Probability quantifies as a number between 0 and 1, where, roughly speaking, 0 indicates impossibility and 1 indicates certainty. The higher the probability of an event, the more likely it is that the event will occur.[1]
The whole field of probability is important because uncertainty and randomness show up in pretty much every aspect of life. A good grasp of probability will help you make more informed decisions and make sense of uncertainty.
Before diving into permutations and combinations, there is one important term that needs to be discussed: the factorial.
What’s a Factorial?
Good question. According to Wikipedia:
Factorial of a positive integer n, denoted by n! is the product of all positive integers less than or equal to n[2]
You can calculate factorials with the following formula:
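Written out in standard notation, the definition quoted above reads as follows. The convention 0! = 1 is standard, though it is not part of the quote:

```latex
n! = n \times (n-1) \times (n-2) \times \dots \times 2 \times 1, \qquad 0! = 1
```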

And here’s a quick hands-on example:
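Taking n = 5 as an illustrative value (an assumption made here; any small positive integer works the same way), the product expands to:

```latex
5! = 5 \times 4 \times 3 \times 2 \times 1 = 120
```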

Now you might be wondering how to calculate factorials in Python. Out-of-the-box functions exist (the standard library’s math.factorial, for one), but it’s really easy to define your own function, and that’s exactly what we’ll do.
Here’s a simple recursive function which will get the job done:
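A minimal sketch of such a function is shown below. The name `factorial`, the guard against negative input, and the sample input 5 are choices made for this illustration rather than details taken from the original listing:

```python
def factorial(n: int) -> int:
    """Return n! for a non-negative integer n, computed recursively."""
    if n < 0:
        raise ValueError("factorial is undefined for negative integers")
    # Base case: 0! and 1! are both 1.
    if n <= 1:
        return 1
    # Recursive step: n! = n * (n - 1)!
    return n * factorial(n - 1)


# 5 is an illustrative input.
print(factorial(5))  # 120
```

For very large n, a recursive version like this can hit Python’s recursion limit, which is one reason to prefer math.factorial outside of teaching examples.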

And now we can use this function to verify the example from above:

Okay, that’s great, but when would you use factorials in the real world?
Let’s say that 5 people are running a race, and you want to find out in how many ways first, second, and third place can be filled. You could grab a sheet of paper and write down every possible outcome, but why? What if there were 100 people?
Here’s how to solve the previous task with the use of factorials:
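There are 5 candidates for first place, 4 remaining for second, and 3 for third. A sketch of that count expressed with factorials:

```latex
\frac{5!}{(5-3)!} = \frac{5!}{2!} = \frac{120}{2} = 5 \times 4 \times 3 = 60
```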

In a nutshell, this is called a permutation.
Permutations
Let’s once again start with a definition, shall we:
In mathematics, permutation is the act of arranging the members of a set into a sequence or order, or, if the set is already ordered, rearranging (reordering) its elements.[3]
There are two main ways to count permutations, depending on whether you allow repetition or not. Let’s work this out with an example.
You have a website on which users can register. Each user must provide a password that is exactly 8 characters long, and characters cannot repeat. We first need to determine how many characters are available, counting English letters (ignoring case) and digits:
- the number of letters: 26
- the number of digits: 10
That’s 36 in total, so n = 36. And r = 8, because the password needs to be 8 characters long. Once we know that, it’s easy to calculate the number of unique passwords, given the following formula:
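In the usual notation, the number of ordered arrangements of r items drawn from n without repetition is given below. The symbol P(n, r) is a label chosen here:

```latex
P(n, r) = \frac{n!}{(n-r)!}
```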

If you went ahead and calculated by hand:
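With n = 36 and r = 8, the factorials cancel down to a product of eight terms:

```latex
P(36, 8) = \frac{36!}{(36-8)!} = \frac{36!}{28!}
         = 36 \times 35 \times 34 \times 33 \times 32 \times 31 \times 30 \times 29
         = 1{,}220{,}096{,}908{,}800
```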

Or even in Python, it really is a trivial task:
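One way this might look, as a sketch using only the standard library. The variable names are illustrative, and `math.perm` requires Python 3.8 or newer:

```python
from math import factorial, perm

n = 36  # available characters: 26 letters + 10 digits
r = 8   # password length

# Direct translation of n! / (n - r)!; integer division keeps the result exact.
by_formula = factorial(n) // factorial(n - r)

# math.perm computes the same quantity without building both factorials.
by_builtin = perm(n, r)

print(by_formula)  # 1220096908800
print(by_builtin)  # 1220096908800
```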

Okay, cool, but I want to allow my users to repeat characters. No problem, in that case, we’re talking about permutations with repetition, and the formula is even simpler:
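With repetition allowed, each of the r positions can be filled independently in n ways, so the count is a simple power. The label P_rep is an ad hoc name used only here:

```latex
P_{\text{rep}}(n, r) = n^{r}
```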

You already know what n is (36), and what r is (8), so here’s the solution:
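Plugging the values into the power formula:

```latex
36^{8} = 2{,}821{,}109{,}907{,}456
```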

Once again, implementation in Python is trivial:
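A short sketch with illustrative variable names. Python integers have arbitrary precision, so the power is exact:

```python
n = 36  # available characters: 26 letters + 10 digits
r = 8   # password length

# Each of the r positions can hold any of the n characters.
with_repetition = n ** r

print(with_repetition)         # 2821109907456
print(f"{with_repetition:,}")  # 2,821,109,907,456
```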

That’s a whole lot of password options. Go ahead and read this number out loud, I dare you.
Combinations
Up next on the daily agenda are combinations. You might be wondering what those are and how they differ from permutations. Let’s take it step by step. To start out, here’s a definition:
A combination is a selection of items from a collection, such that (unlike permutations) the order of selection does not matter[4]
To drive the point home, consider the following sentence: a group of people selected for a team is the same group regardless of the order in which they were picked. That’s the whole idea behind combinations. If you select 5 members for the team, you could order them by name, height, or something else, but you would still have the same team; ordering is irrelevant.
So, let’s formalize this idea with a formula. The number of combinations C of a set of n objects taken r at a time is calculated as follows:
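In standard notation, the count is the permutation count divided by r!, because each selection of r items can be ordered in r! ways:

```latex
C(n, r) = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}
```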

Now you can take that equation and solve the following task: In how many ways can you choose 5 people from a group of 10 for a football team?
The group would be the same, no matter the ordering. So let’s see, n would be equal to 10, and r would be 5:
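Substituting n = 10 and r = 5 into the combination formula:

```latex
C(10, 5) = \frac{10!}{5!\,(10-5)!} = \frac{3{,}628{,}800}{120 \times 120} = 252
```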

This can once again be easily done with Python:
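A possible sketch with illustrative variable names, which checks the formula three ways: directly from factorials, with `math.comb` (Python 3.8+), and by enumerating every team with `itertools.combinations`:

```python
from itertools import combinations
from math import comb, factorial

n = 10  # people available
r = 5   # team size

# Direct translation of n! / (r! * (n - r)!).
by_formula = factorial(n) // (factorial(r) * factorial(n - r))

# Built-in binomial coefficient.
by_builtin = comb(n, r)

# Brute force: count every distinct unordered team of r people.
by_enumeration = sum(1 for _ in combinations(range(n), r))

print(by_formula, by_builtin, by_enumeration)  # 252 252 252
```

Enumeration is only practical for small n and r, but it is a handy sanity check on the formula.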

Great! But now you might be wondering if there exists a version of combinations which allows repetition. The answer is yes. I’ll explain now.
Imagine that you’re making a sandwich and for some reason, you’re only allowed to use 4 ingredients out of 10 possible. However, the ingredients don’t have to be unique, for example, you could put cheese 3 times and salami once. It’s perfectly fine, heck, I’m also a cheese person, so kudos to you.
But how would you formalize this idea and express it in a mathematical way? The answer is once again pretty simple:
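The standard result for choosing r items from n types when repeats are allowed is shown below. The label C_rep is an ad hoc name used only here:

```latex
C_{\text{rep}}(n, r) = \binom{n + r - 1}{r} = \frac{(n + r - 1)!}{r!\,(n - 1)!}
```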

Let’s use the formula to work out the example from above. n would once again be 10 (because there are 10 different ingredients), and r would be 4 (because you can only choose 4):
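Substituting n = 10 and r = 4 into the formula for combinations with repetition:

```latex
\binom{10 + 4 - 1}{4} = \binom{13}{4} = \frac{13!}{4!\,9!}
  = \frac{13 \times 12 \times 11 \times 10}{24} = 715
```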

And once again, you can use Python for verification:
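A sketch of that verification with illustrative variable names. It compares the closed-form count against a brute-force enumeration using `itertools.combinations_with_replacement`:

```python
from itertools import combinations_with_replacement
from math import comb

n = 10  # distinct ingredients
r = 4   # slots in the sandwich, repeats allowed

# Closed form: C(n + r - 1, r).
by_formula = comb(n + r - 1, r)

# Brute force: count every unordered selection of r ingredients with repeats.
by_enumeration = sum(1 for _ in combinations_with_replacement(range(n), r))

print(by_formula, by_enumeration)  # 715 715
```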

Neat. And that’s about enough for one article.
Before you go
While combinations and permutations are mathematically simple, the trick lies in framing real-world problems in these terms. In other words, it can sometimes be tricky to identify n and r in an everyday problem.
While I can’t help you with that part, I’m hoping that this article gave you an idea of what you can do with both n and r, once they are obtained.
Thanks for reading, and as always, don’t hesitate to leave your thoughts in the comment section.
Resources
[1] https://en.wikipedia.org/wiki/Probability
[2] https://en.wikipedia.org/wiki/Factorial