
Support Vector Machines (SVMs) are one of the most popular machine learning models in the Data Science world. Intuitively, it’s a rather simple concept. Mathematically speaking, however, support vector machines can seem like a black box.
In this article, I have two goals:
- I want to demystify the mechanics underlying support vector machines and give you a better understanding of their overall logic.
- I want to show you how to implement a simple SVM in Python and deploy it as a web app using Gradio.
With that said, let’s dive right into it!
Refresher: What is a Support Vector Machine?

A Support Vector Machine (SVM) is a supervised classification technique. The essence of SVMs simply involves finding a boundary that separates different classes from each other.
- In 2-dimensional space, the boundary is called a line.
- In 3-dimensional space, the boundary is called a plane.
- In any dimension greater than 3, the boundary is called a hyperplane.
Let’s assume that there are two classes of data. A support vector machine will find the boundary that maximizes the margin between the two classes. There are many boundaries that can separate the two classes, but only one maximizes the margin, i.e. the distance between the boundary and the closest points of each class.
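To make this concrete, here is a minimal sketch that fits a maximum-margin linear SVM on a tiny, made-up 2-D dataset with scikit-learn (the points, labels, and `C` value below are illustrative assumptions, not data from this article):

```python
import numpy as np
from sklearn.svm import SVC

# Two clearly separated clusters: "blue" points (label 1) and "red" points (label -1)
X = np.array([[1, 1], [2, 1], [1, 2],   # class  1
              [5, 5], [6, 5], [5, 6]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

# kernel="linear" gives the plain maximum-margin boundary described above;
# a very large C approximates a hard margin (no misclassification allowed)
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# The boundary is the line w·x + b = 0
w, b = clf.coef_[0], clf.intercept_[0]
print("w =", w, "b =", b)
print("support vectors:", clf.support_vectors_)
```

The `support_vectors_` attribute holds the points that sit on the edge of the margin; they are the only points that determine where the boundary goes.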
Notation
- n = number of data points
- m = number of attributes
- x_ij = ith attribute of the jth data point
- y_j = 1 if the data point is blue, −1 if the data point is red
The Math

The objective that a basic support vector machine minimizes can be written as:

Σ_j max(0, 1 − y_j(a_0 + a_1x_1j + a_2x_2j + … + a_mx_mj)) + λ(a_1² + a_2² + … + a_m²)

After reading this, you’ll understand what this equation is trying to achieve. Don’t worry if it looks confusing! I will do my best to break it down step by step.
Keep in mind that this covers the math for a fundamental support vector machine and does not consider things like kernels or non-linear boundaries.
Breaking this down, we can separate the objective into two parts:

Red Part: The red part, the first term, focuses on minimizing the error, i.e. the number of falsely classified points, that the SVM makes.
Blue Part: The blue part, the second term, focuses on maximizing the margin, which was discussed earlier in the article.
Let’s first talk about the blue part, or the second term of the equation.
Second term: Maximizing the margin

As I said earlier, there are many boundaries that you can put between two classes of data points, but there is only one boundary that maximizes the margin between the two classes.
We want the decision boundary to be as far away from the support vectors as possible so that when we have new data, it will fall in one or the other class with greater certainty.

Let’s suppose that the two sides of the margin are given by the equations a_0 + a_1x_1 + … + a_mx_m = m and a_0 + a_1x_1 + … + a_mx_m = −m. Don’t worry so much about m/−m. Just notice how each one is the equation of a line (or, in higher dimensions, a hyperplane).
The distance between the two dotted lines is given by the following formula:

distance = 2m / sqrt(a_1² + a_2² + … + a_m²)

Don’t worry so much about how this equation is derived. Rather, notice that as the coefficients a_1, a_2, …, a_m get smaller, the denominator gets smaller, and the distance, i.e. the margin, gets larger!
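If you want to convince yourself of this numerically, here is a quick sketch (the coefficients w and b are arbitrary example values, chosen so that ||w|| = 5):

```python
import numpy as np

# Boundary coefficients w·x + b = 0; the margin edges are w·x + b = ±1
w = np.array([3.0, 4.0])  # arbitrary example: ||w|| = 5
b = 2.0

# The formula from the article, with the scaling factor m set to 1
margin = 2 / np.linalg.norm(w)

# Check directly: take one point on each margin line and measure
# the distance between them along the direction of w
p_plus = (1 - b) * w / w.dot(w)    # a point satisfying w·x + b = 1
p_minus = (-1 - b) * w / w.dot(w)  # a point satisfying w·x + b = -1
direct = np.linalg.norm(p_plus - p_minus)

print(margin, direct)  # both 0.4: smaller coefficients => wider margin
```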

Now that you understand what the second term (blue part) means, let’s talk about the first term (red part).
First term: Minimizing Error
In reality, it’s not going to be the case that you’ll be able to find a hyperplane that perfectly separates different classes from each other. And even if it exists, it’s not always the case that you’ll want to use that hyperplane.
Consider the image below:

Technically, we could set the boundary so that every red and blue point lands on the correct side of it. However, given that the blue square on the left is an outlier, it may be more ideal to allow an imperfect hyperplane with a larger margin; this is known as a soft margin:

Now that we’ve introduced the concept of "error", you should understand that the full equation for Support Vector Machines is trying to minimize error while maximizing the margin.
Now that we understand the goal behind the first term, let’s revisit it:

Σ_j max(0, 1 − y_j(a_0 + a_1x_1j + … + a_mx_mj))

In English, this equation says to “take the sum of the errors of each point”.

So how does the expression inside the max, 1 − y_j(a_0 + a_1x_1j + … + a_mx_mj), represent the error of each point? Let’s dive into that:

Let’s set m and −m to 1 and −1 respectively. In reality, they could be any number, since m here is just a scaling factor for the margin (not to be confused with m, the number of attributes).
Since y_j = 1 if a data point is blue and −1 if it is red, we can combine the equation for the upper boundary (a_0 + a_1x_1j + … + a_mx_mj ≥ 1 for blue points) and the equation for the lower boundary (a_0 + a_1x_1j + … + a_mx_mj ≤ −1 for red points) to represent all points with a single condition:

y_j(a_0 + a_1x_1j + … + a_mx_mj) ≥ 1

This can be re-written as the following:

1 − y_j(a_0 + a_1x_1j + … + a_mx_mj) ≤ 0
This equation assumes that every point is classified on the correct side of the boundary. Any point that is on the wrong side of the boundary will not satisfy the equation.

For some of you, I bet the light bulbs are lighting up in your head. If not, no worries! We’re almost at the end.

Remember when I said that this equation “takes the sum of the errors of each point”? Specifically, for each point it takes the max of zero and 1 − Z, where Z represents:

Z = y_j(a_0 + a_1x_1j + … + a_mx_mj)

The rule is as follows:
- If a given point is on the correct side of the boundary, then Z will be greater than 1. This means 1 − Z will be a negative number, so the max returns 0 (no error).
- If a given point is on the wrong side of the boundary, then Z will be less than 1. This means 1 − Z will be a positive number, so the max returns a value greater than 0 (error).
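This per-point rule is exactly the hinge loss, and it is simple to express in code. A small sketch, using made-up coefficients for illustration:

```python
import numpy as np

# Example boundary coefficients (made up for illustration)
a = np.array([1.0, -1.0])
a0 = 0.0

def hinge_error(x, y):
    """Error of one point: max(0, 1 - Z), where Z = y * (a0 + a·x)."""
    z = y * (a0 + a.dot(x))
    return max(0.0, 1.0 - z)

# A blue point (y = 1) comfortably on the correct side: Z = 3, error 0
print(hinge_error(np.array([3.0, 0.0]), 1))
# A blue point on the wrong side: Z = -3, error 4
print(hinge_error(np.array([0.0, 3.0]), 1))
```

Summing `hinge_error` over all points gives the first term of the objective.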
Conclusion

And that’s it! To summarize, the objective of a support vector machine is to minimize the total error while maximizing the margin, which it does by keeping the coefficients a_i small.
What’s next? Building an SVM.
Now that you understand the math behind SVMs, the next step is to actually build a support vector machine model in Python and deploy it!
I’m going to use the classic iris dataset to show how you can build a support vector machine in Python.
Setup
Before starting, you’ll need to install the following libraries:
- NumPy
- Pandas
- Seaborn
- scikit-learn
- Gradio
# Importing libraries
import numpy as np
import pandas as pd
import seaborn as sns
# Importing data
iris = sns.load_dataset("iris")
Data Preparation
from sklearn.model_selection import train_test_split
# Splitting features and target variables
X = iris.drop("species", axis=1)
y = iris["species"]
# Splitting data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
Data Modelling
from sklearn.svm import SVC
# Creating model
model = SVC(probability=True)
model.fit(X_train,y_train)
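Before wiring the model into a web app, it’s worth sanity-checking it on the held-out test set. A minimal sketch (using scikit-learn’s built-in copy of the iris data and an assumed `random_state` so the snippet runs standalone):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Built-in copy of the same iris data, so this snippet is self-contained
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = SVC(probability=True)
model.fit(X_train, y_train)

# Fraction of test points classified correctly
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

SVMs tend to do very well on iris, since the classes are close to linearly separable.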
Creating Web App using Gradio
# Writing a prediction function
def predict_flower(sepal_length, sepal_width, petal_length, petal_width):
    # Column names must match the ones the model was trained on
    df = pd.DataFrame.from_dict({'sepal_length': [sepal_length],
                                 'sepal_width': [sepal_width],
                                 'petal_length': [petal_length],
                                 'petal_width': [petal_width]})
    predict = model.predict_proba(df)[0]
    return {model.classes_[i]: predict[i] for i in range(3)}
# Importing gradio
import gradio as gr
# Creating Web App
# Note: in Gradio 2.x this was gr.inputs.Slider(..., default=5)
sepal_length = gr.Slider(minimum=0, maximum=10, value=5, label="sepal_length")
sepal_width = gr.Slider(minimum=0, maximum=10, value=5, label="sepal_width")
petal_length = gr.Slider(minimum=0, maximum=10, value=5, label="petal_length")
petal_width = gr.Slider(minimum=0, maximum=10, value=5, label="petal_width")
gr.Interface(predict_flower, [sepal_length, sepal_width, petal_length, petal_width], "label", live=True).launch(debug=True)
And there you have it! You should now have a fully functional web app where you can play around with the model’s inputs and immediately see the output probabilities.
Thanks for Reading!
If you made it to the end, congrats! You should now have a strong understanding of how a basic support vector machine works, and you know how to build your own fully functional SVM web application.
If you enjoyed this, please give this some claps and follow me on Medium!
As always I wish you best in your learning endeavors. 😀
Not sure what to read next? I’ve picked another article for you:
A Complete 52 Week Curriculum to Become a Data Scientist in 2021
and another!
4 Machine Learning Concepts I Wish I Knew When I Built My First Model
Terence Shin