The Multiclass Definitions

stay trying.
Towards Data Science
3 min read · Apr 13, 2018


Photo by Tuân Nguyễn Minh on Unsplash

One of the first lessons a budding machine learning programmer learns is about binary classification. It is the idea that you and your model are trying to classify an input as one of two outcomes. Is it a hot dog or not? Should you reject or accept someone for a loan? Do you think a student will pass or fail a class?

Once you get the hang of that, courses start to teach multi-class classification. This is the idea that inputs can be classified as one of many outputs, which represents the world more closely. Maybe you would like to classify an image of a handwritten digit as one of 0 through 9. Or maybe you are curious what kind of flower that pretty one on your neighbor’s lawn is, so you train a model to find out. Some first key lessons include things like one-hot encoding and label encoding, both sketched below.
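Here is a minimal sketch of those two encodings using scikit-learn; the flower names are just made-up examples:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

flowers = np.array(["rose", "tulip", "daisy", "rose"])

# Label encoding: each class becomes a single integer (sorted alphabetically).
print(LabelEncoder().fit_transform(flowers))  # [1 2 0 1]

# One-hot encoding: each class becomes its own 0/1 column.
print(OneHotEncoder().fit_transform(flowers.reshape(-1, 1)).toarray())
```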

We are all here to learn, and I recently came across the distinctions between the different multiclass and multilabel classification setups, which I would like to share with everyone. So, let’s dig in.

Multilabel Classification

This setting can be thought of as classifying your input as belonging to one or many classes at once. Think of Medium articles for one second. When authors are getting ready to publish their article, they have to decide on a set of tags that represent it. These could be tags like ‘artificial intelligence’, ‘dumb story’, or ‘Towards Data Science’.

Now, Medium or someone with some time could train a model that learns how people tag their articles by doing some natural language processing on the article text itself. That model would then be able to predict or recommend the top 5 ‘labels’ or tags that a new article should have.

The idea here is that there is no mutual exclusivity: the model can assign the input to one or several classes at the same time. A toy version of this tag-prediction idea is sketched below.
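Here is a minimal sketch of that tag-prediction idea; the article snippets and tag sets are made up for illustration, but the classes are real scikit-learn APIs:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

articles = [
    "neural networks and deep learning",
    "a funny story about my cat",
    "gradient descent explained with jokes",
]
tags = [
    ["artificial intelligence", "Towards Data Science"],
    ["dumb story"],
    ["artificial intelligence", "dumb story"],
]

# Turn the variable-length tag lists into a 0/1 indicator matrix.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(tags)

X = TfidfVectorizer().fit_transform(articles)

# One binary classifier per tag: no mutual exclusivity between outputs.
clf = OneVsRestClassifier(LogisticRegression()).fit(X, y)
print(mlb.inverse_transform(clf.predict(X)))
```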

Multiclass Classification

Conversely, multiclass classification does have mutual exclusivity. If we extend the Medium analogy a little bit further, the same model would only predict or recommend one of the tags instead of many.

This type of problem is more commonly covered in the machine learning guides out there because each training example has a single, well-defined ground-truth class. For example, if you have a classifier that predicts dog breeds, you would want the model to choose exactly one breed instead of two.

Interestingly, there are a couple of sub-strategies within this set of methods: one-vs-one and one-vs-all (also called one-vs-rest) classification. In essence, they are smart ways to divide the multiclass problem into easier sub-problems, particularly binary classification problems: one-vs-rest trains one binary classifier per class (“this class or anything else?”), while one-vs-one trains a binary classifier for every pair of classes and lets them vote. Both are sketched below.
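Here is a minimal sketch comparing the two decompositions on the classic iris dataset; the choice of SVC as the base classifier is just one example:

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # 3 flower classes

# One-vs-rest: one binary classifier per class.
ovr = OneVsRestClassifier(SVC()).fit(X, y)

# One-vs-one: one binary classifier per pair of classes.
ovo = OneVsOneClassifier(SVC()).fit(X, y)

# With 3 classes both happen to need 3 classifiers; with k classes,
# one-vs-rest needs k while one-vs-one needs k * (k - 1) / 2.
print(len(ovr.estimators_), len(ovo.estimators_))
```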

Multioutput Regression

This method is similar to multiclass classification, but instead of predicting a class, the model spits out one or more numbers or continuous variables as the result. If you are looking to create a model that outputs the stock price of Apple as well as the momentum of its next move, this may be the way to go; a toy sketch follows below.
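Here is a minimal sketch of multioutput regression; the two-target stock example above is hypothetical, so synthetic data stands in for it:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))             # 100 samples, 5 features
y = np.column_stack([X.sum(axis=1),       # target 1: stands in for "price"
                     X[:, 0] - X[:, 1]])  # target 2: stands in for "momentum"

# Wraps a single-output regressor so it fits one model per target column.
model = MultiOutputRegressor(RandomForestRegressor(n_estimators=50)).fit(X, y)
print(model.predict(X[:2]))  # two numbers per sample
```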

As usual, there is always more to learn within even the first topics one may study in machine learning. If you want to dive even deeper into these methods, you can check out the different algorithms that fall into each category. SVC, naive Bayes, and random forests can each fit into different categories, and exploring how may give you a better understanding of the differences between these sets of tools.

Thanks for reading.
