
The machine learning ecosystem covers a broad range of algorithms. Some of these algorithms are highly complex, whereas others are relatively simple. The performance of an algorithm is not necessarily proportional to its complexity.
Logistic regression is one of the simpler ones. However, it is very efficient and is used in many applications such as spam detection and churn prediction.
Logistic regression is a supervised learning algorithm that is mainly used for binary classification tasks. In this article, I will share three key concepts that I think are important for understanding how the logistic regression algorithm works.
Log odds
Log odds is best explained through basic probability concepts. The probability of an event is a measure of how likely that event is to occur.
Consider the event of an email being spam. If the probability of an email being spam is 0.9, then the probability of this email not being spam is 0.1. The term odds relates these two values.
odds = P(spam) / P(not spam) = 0.9 / 0.1 = 9
In our case, the odds are 9. The higher the odds, the more likely the event is to occur.
Log odds is the logarithm of the odds. Please take a look at the following table to see why we care about the log odds for the logistic regression algorithm.
| Probability | Odds | Log odds |
|-------------|------|----------|
| 0.1         | 0.11 | -2.20    |
| 0.3         | 0.43 | -0.85    |
| 0.5         | 1.00 | 0.00     |
| 0.7         | 2.33 | 0.85     |
| 0.9         | 9.00 | 2.20     |
Let’s assume I show you the probabilities of an email being spam and not spam. Then I ask you to predict if the email is spam. The obvious answer is spam if the probability of being spam is higher than 0.5.
The log odds of a probability of 0.5 is 0. This is an important relationship for the logistic regression algorithm, as we will see later in the article.
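For instance, a small Python sketch can reproduce the kind of values shown in the table above (the probability values are just illustrative):

```python
import math

# Odds and log odds for a few illustrative probability values
for p in [0.1, 0.3, 0.5, 0.7, 0.9]:
    odds = p / (1 - p)          # odds = P(event) / P(not event)
    log_odds = math.log(odds)   # natural logarithm of the odds
    print(f"p = {p:.1f}  odds = {odds:.2f}  log odds = {log_odds:.2f}")
```

Note that the log odds is negative below a probability of 0.5, positive above it, and exactly 0 at 0.5.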
Sigmoid function
The sigmoid function is the core of the logistic regression algorithm. It takes any real-valued number and maps it to a value between 0 and 1.
y = 1 / (1 + e^(-x))
Whatever the value of x is, y takes a value between 0 and 1. It is important to have output values between 0 and 1 because we cannot have a probability value outside this range.
If x is 0, y becomes 0.5, which is the common threshold for binary classification.
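A minimal sketch of the sigmoid function in Python (the helper name sigmoid is just for illustration):

```python
import math

def sigmoid(x: float) -> float:
    """Map any real-valued number to a value between 0 and 1."""
    return 1 / (1 + math.exp(-x))

print(sigmoid(-5))  # close to 0
print(sigmoid(0))   # exactly 0.5, the common classification threshold
print(sigmoid(5))   # close to 1
```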
Regression vs classification
Although it contains the word "regression" in its name, logistic regression is a classification algorithm. By using the sigmoid function and a few logarithmic manipulations, logistic regression converts a classification problem into solving a linear equation, much like we do in linear regression.
Let’s first rearrange the sigmoid function.
y = 1 / (1 + e^(-x))
1 + e^(-x) = 1 / y
e^(-x) = (1 - y) / y
e^x = y / (1 - y)
If we take the natural log of both sides of the expression in the final step, we get the following:
ln(y / (1 - y)) = x   (equation 1)
The y in the log expression represents the probability of the positive class (i.e. the email is spam).
- If y = 0.5, then x = 0
- If y > 0.5, then x > 0
- If y < 0.5, then x < 0
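As a quick numerical check of equation 1 and the bullet points above, here is a small Python sketch that passes a few x values through the sigmoid and then recovers them from the log odds:

```python
import math

def sigmoid(x: float) -> float:
    return 1 / (1 + math.exp(-x))

# Applying the log odds (logit) to the sigmoid output recovers x,
# so the sign of x matches whether y is above or below 0.5.
for x in [-2.0, 0.0, 2.0]:
    y = sigmoid(x)
    log_odds = math.log(y / (1 - y))
    print(f"x = {x:+.1f}  y = {y:.3f}  ln(y / (1 - y)) = {log_odds:+.1f}")
```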
We can replace x in equation 1 with a linear equation.
b0 + b1x1 + b2x2 + ... + bnxn = ln(y / (1 - y))
By setting the linear expression on the left-hand side equal to 0, we find the combination of feature values that makes the probability of the positive class 0.5.
The parameters of the function are determined in the training phase using maximum likelihood estimation. Then, for any given values of the features (x1, …, xn), the probability of the positive class can be calculated.
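Here is a minimal sketch of that final step in Python. The coefficient values b0, b1, b2 and the feature values are made up for illustration; in practice the coefficients would come from the training phase:

```python
import math

# Hypothetical parameters, as if learned by maximum likelihood estimation
b0, b1, b2 = -1.5, 0.8, 2.1

def predict_proba(x1: float, x2: float) -> float:
    """Probability of the positive class for the given feature values."""
    log_odds = b0 + b1 * x1 + b2 * x2     # linear combination of the features
    return 1 / (1 + math.exp(-log_odds))  # sigmoid turns log odds into a probability

print(predict_proba(0.5, 1.0))  # probability that this example is the positive class
```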
Conclusion
Logistic regression is a simple yet very powerful algorithm. Although there are many more complex algorithms that can solve classification tasks, logistic regression is still used in many applications because of its efficiency.
It is important to note that we should not always use 0.5 as the threshold that separates the positive and negative classes. In the spam email case, we want to be almost certain before classifying an email as spam, because we do not want the user to miss important emails. Logistic regression allows us to adjust the threshold value for such tasks.
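As a rough sketch of how this can look in practice, assuming a scikit-learn LogisticRegression model trained on made-up data, we can apply a stricter threshold to the predicted probabilities instead of relying on the default predict behavior:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up training data; in a real spam filter these would be email features
X = np.array([[0.2, 1.0], [0.9, 3.0], [0.1, 0.5], [0.8, 2.5]])
y = np.array([0, 1, 0, 1])

model = LogisticRegression().fit(X, y)

# Default behavior: predict() effectively uses a 0.5 threshold
print(model.predict(X))

# Stricter rule for spam: only flag an email when we are almost sure
spam_probability = model.predict_proba(X)[:, 1]
print((spam_probability >= 0.9).astype(int))
```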
Thank you for reading. Please let me know if you have any feedback.