
Introduction
These days I am working on a Machine Learning algorithm that can identify different types of damage found in buildings. The damage types are not all alike: each has different causes and risks, and we have identified about four distinct types of fracture. The algorithm will then be deployed on a drone that automatically takes pictures of a building and reports what damage is present and how severe it is.
Obviously, a photo taken by the drone may contain several types of damage at once, so given a photo the drone must be able to identify all the damage classes present, not just one. That is why I started studying the so-called multitarget classification task, and I am writing this article hoping it will be useful to you too.
What is Multitarget Classification?
Multitarget classification is a type of Machine Learning task that involves predicting multiple labels for a single sample. Unlike traditional binary or multiclass classification, where each sample is assigned to a single class, multitarget classification allows a sample to belong to multiple classes simultaneously. This can be useful in situations where a single sample can have multiple relevant labels, such as a news article that can be classified as being about politics, sports, and entertainment at the same time.
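Concretely, multitarget (also called multilabel) targets are often encoded as binary indicator vectors with one slot per label; for the news example, this might look as follows:

labels = ['politics', 'sports', 'entertainment']
# one article can activate several labels at once
article_target = [1, 1, 0]  # about politics and sports, but not entertainment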
Let’s see an example to understand the different types of classifications.

There are several approaches to tackling multitarget classification problems, including the use of binary classifiers, multiclass classifiers and multitask learning. In this article, we will explore the different types of multitarget classification and discuss their pros and cons. We will also look at evaluation metrics for multitarget classification. Finally, I will offer some personal insights and conclusions on the advantages and limitations of multitarget classification and best practices for success.
Types of Multitarget Classification
There are several approaches to multitarget classification, each with its own advantages and limitations.
Binary Classifiers

One approach to multitarget classification is to use multiple binary classifiers, where each classifier is trained to predict a single label. For example, if we have a multitarget classification problem with three labels (A, B, and C), we can train three separate binary classifiers, one to predict label A, one to predict label B, and one to predict label C and then run all three models to classify an instance. This approach is simple and easy to implement, but it can be inefficient if the number of labels is large. In addition, the performance of the classifiers may be affected by the imbalanced distribution of labels in the training data.
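As a minimal sketch of this approach, assuming hypothetical data and scikit-learn's LogisticRegression as the base binary classifier:

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.rand(100, 20)            # hypothetical data: 100 samples, 20 features
Y = np.random.randint(0, 2, (100, 3))  # 3 independent binary labels: A, B, C

# train one independent binary classifier per label
classifiers = [LogisticRegression().fit(X, Y[:, i]) for i in range(Y.shape[1])]

# at inference time, run every classifier on the same sample
predictions = [clf.predict(X[:1])[0] for clf in classifiers]
print(predictions)  # e.g. [1, 0, 1] -> labels A and C are present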
Multiclass Classifiers
Another approach to multitarget classification is to use a multiclass classifier, which is designed to predict multiple labels simultaneously. There are several types of multiclass classifiers, including one-vs-rest and one-vs-one.
- One-vs-rest (OvR) classifiers are trained to make a binary decision for each label, treating all other labels as the negative class. For example, in the case of three labels (A, B, and C), an OvR classifier would be trained to predict label A vs. not-A, label B vs. not-B, and label C vs. not-C. In practice, you end up with multiple binary classifiers, as in the previous approach. This approach is simple and efficient, but it can suffer from imbalanced label distributions and may not take into account the dependencies between labels.
- One-vs-one (OvO) classifiers are trained to make a binary decision for each pair of labels. For example, in the case of three labels (A, B, and C), an OvO classifier would be trained to predict A vs. B, A vs. C, and B vs. C. This approach is more computationally intensive than OvR, but it can handle imbalanced label distributions and capture the dependencies between labels. Both strategies are sketched in code right after this list.
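Both wrappers ship with scikit-learn; here is a minimal sketch on hypothetical data. Note that OneVsRestClassifier also accepts a multilabel indicator matrix, while OneVsOneClassifier only applies to single-label multiclass problems.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

X = np.random.rand(100, 20)  # hypothetical data: 100 samples, 20 features

# OvR with a multilabel indicator matrix: one binary problem per label
Y_multilabel = np.random.randint(0, 2, (100, 3))
ovr = OneVsRestClassifier(LogisticRegression()).fit(X, Y_multilabel)
print(ovr.predict(X[:1]))  # e.g. [[1 0 1]]

# OvO handles single-label multiclass only: one classifier per pair of classes
y_multiclass = np.random.randint(0, 3, 100)
ovo = OneVsOneClassifier(LogisticRegression()).fit(X, y_multiclass)
print(ovo.predict(X[:1]))  # e.g. [2]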
Multitask Learning
Multitask learning is a type of Machine Learning that involves training a model to predict multiple tasks simultaneously. In the context of multitarget classification, multitask learning consists of training a single model to predict all the labels for a sample.
If the tasks are similar, for example classifying different types of defects, or detecting the presence of a car, a bike, and a truck, this approach is more efficient than using multiple binary or multiclass classifiers, but it requires a large amount of labeled data and rests on the strong assumption that the tasks are related.
Let’s code!
Let’s see how to implement a multitask learning algorithm in the field of Computer Vision, while also adopting a transfer learning methodology.
What I want to do is to take a network that is pre-trained on image recognition like Resnet (you can use others, of course) and modify it so that it can solve multiple tasks simultaneously.
What is usually done in multiclass classification cases is to use Resnet and attach on top of it a classifier that has as many output neurons as there are classes in the dataset, and thus get a class for each instance.
We, however, want to recognize not just one class but multiple classes at the same time, so we are simply going to attach multiple classifiers (linear layers) on top of Resnet.
In our case though, each classifier is a binary classifier: it only needs to tell us whether a car, a bike, or a truck is present. To create a binary classifier, we only need one output neuron that answers yes/no. So the network architecture is pretty simple.

In a more general case where you need 3 multiclass classifiers, the architecture should be the following.

Let’s see how to actually implement such a network using PyTorch.
import torch
from torch import nn

class ResnetBasedModel(nn.Module):
    def __init__(self, pretrained, clf_in_features, labels_nr: int, freeze: bool = True):
        super().__init__()
        self.pretrained_model = pretrained
        # pre-trained model without its last fully connected layer
        self.model_wo_fc = nn.Sequential(*(list(self.pretrained_model.children())[:-1]))
        # optionally freeze the backbone so only the classifier heads are trained
        if freeze:
            for param in self.model_wo_fc.parameters():
                param.requires_grad = False
        # one binary classifier head per label
        self.classifiers = nn.ModuleDict()
        for i in range(labels_nr):
            self.classifiers[f'clf_{i}'] = nn.Sequential(
                nn.Dropout(p=0.2),
                nn.Linear(in_features=clf_in_features, out_features=1)
            )

    def forward(self, x):
        x = self.model_wo_fc(x)
        x = torch.flatten(x, 1)  # (batch, features, 1, 1) -> (batch, features)
        return {name: classifier(x) for name, classifier in self.classifiers.items()}
The previous code implements a Python class that inherits from nn.Module, which is the classic way to create a neural-network-based model in PyTorch.
The model takes as input a pre-trained network such as Resnet (pretrained) and the number of output features of the second-to-last layer of the pre-trained network (clf_in_features); in Resnet34, for example, this number is 512, while in Resnet50 and Resnet101 it is 2048.
The class also takes as input the number of binary classifier heads to create (labels_nr) and whether or not to freeze the parameters of the pre-trained network (freeze).
Let’s see in more detail how this class works.
The following loop freezes the parameters of the pre-trained network so that we train only the output classifiers, which greatly speeds up training.
if freeze:
    for param in self.model_wo_fc.parameters():
        param.requires_grad = False
After that, I create as many classifiers as specified in the arguments, storing them in a dictionary.
for i in range(labels_nr):
    self.classifiers[f'clf_{i}'] = nn.Sequential(
        nn.Dropout(p=0.2),
        nn.Linear(in_features=clf_in_features, out_features=1)
    )
Finally, in the forward method, I pass each input x through the pre-trained network and then through each classifier in the dictionary, returning the output of every classifier.
def forward(self, x):
    x = self.model_wo_fc(x)
    x = torch.flatten(x, 1)
    return {name: classifier(x) for name, classifier in self.classifiers.items()}
Now you can use your network for multitarget classification. Remember that to instantiate a pre-trained network and pass it as input to the class, you only need the Torchvision models module.
from torchvision import models
resnet34 = models.resnet34(weights=models.ResNet34_Weights.DEFAULT)
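For example, a minimal instantiation and forward pass might look like this (the three heads match the car/bike/truck example from earlier, and 512 is the feature size of Resnet34):

model = ResnetBasedModel(pretrained=resnet34, clf_in_features=512, labels_nr=3)

dummy_batch = torch.randn(8, 3, 224, 224)  # a hypothetical batch of 8 RGB images
outputs = model(dummy_batch)               # {'clf_0': ..., 'clf_1': ..., 'clf_2': ...}
print({name: out.shape for name, out in outputs.items()})  # each head outputs [8, 1] logits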
To train the model you then need a loss for each classifier head and to sum these losses; the summed loss is used to update the entire model. In the snippet below I assume a binary cross-entropy loss (nn.BCEWithLogitsLoss), since each head outputs a single logit.
def criterion(y, yhat):
    '''y: dict of ground-truth tensors, keyed like 'label_clf_0'; yhat: dict of model outputs'''
    losses = 0
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    loss_func = nn.BCEWithLogitsLoss()  # assumed binary loss: each head outputs a single logit
    for key in yhat:
        losses += loss_func(yhat[key], y[f'label_{key}'].float().unsqueeze(1).to(device))
    return losses
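For completeness, here is a sketch of a single training step built around this criterion; the Adam optimizer, the batch shapes, and the label dictionary format are my assumptions, not necessarily the original pipeline:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
# optimize only the classifier heads, since the backbone is frozen
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)

# hypothetical batch: images plus one binary target per classifier head
images = torch.randn(8, 3, 224, 224).to(device)
y = {f'label_clf_{i}': torch.randint(0, 2, (8,)) for i in range(3)}

optimizer.zero_grad()
yhat = model(images)
loss = criterion(y, yhat)  # sum of the per-head binary losses
loss.backward()
optimizer.step()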
Evaluation Metrics for Multitarget Classification
Evaluating the performance of a multitarget classification model can be challenging, as there are multiple labels to consider and different ways of combining them. Here are some common evaluation metrics for multitarget classification:
Hamming loss: This metric measures the fraction of labels that are incorrectly predicted. It is calculated as the number of misclassified labels divided by the total number of labels.
This is the metric I preferred to use to evaluate my model, and here is an implementation of the Hamming loss.
def hamming_error(yhat: list, y: list) -> float:
    loss = sum([yhat_i != y_i for yhat_i, y_i in zip(yhat, y)])
    avg_loss = loss / len(yhat)
    return avg_loss
In this snippet, the elements of the two arrays are compared pairwise and the average number of mismatches is returned. One can easily generalize this code to compare entire batches at a time, as sketched below.
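As a sketch of that batched generalization, assuming binary prediction tensors of shape (batch_size, nr_labels):

def batch_hamming_error(yhat: torch.Tensor, y: torch.Tensor) -> float:
    '''yhat, y: binary tensors of shape (batch_size, nr_labels)'''
    return (yhat != y).float().mean().item()

yhat = torch.tensor([[1, 0, 1], [0, 1, 1]])
y = torch.tensor([[1, 1, 1], [0, 1, 0]])
print(batch_hamming_error(yhat, y))  # 2 mismatches out of 6 labels -> 0.333...

Let's quickly look at the other metrics used in this area.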
- Ranking loss: This metric measures how many label pairs are ordered incorrectly when the model ranks the labels for a sample. For example, if a sample's true label ranking is A, B, C and the model ranks them C, B, A, all three pairs (A-B, A-C, B-C) are inverted, giving the maximum ranking loss.
- Jaccard index: This metric measures the overlap between the predicted labels and the true labels for a sample. It is calculated as the size of the intersection divided by the size of the union.
- F1 score: This metric is a balance between precision and recall, where precision is the fraction of predicted labels that are correct and recall is the fraction of true labels that are predicted. It is calculated as the harmonic mean of precision and recall. To use the F1 score for multitarget classification, you need to calculate the precision and recall for each label separately, and then average the scores across all labels to obtain the overall F1 score. This metric is more sensitive to imbalanced label distributions than the previous metrics.
F1 = 2 * (precision * recall) / (precision + recall)
- Average precision: This metric summarizes the precision-recall curve for each label. It is calculated by averaging the precision values obtained at each point where recall increases, i.e., at each correctly predicted label in the ranking.
Average precision = (1/n) * Σ(precision at each recall value)
To calculate the average precision for multiple labels, you can simply average the per-label average precisions. For example, if you have three labels (A, B, and C), you can calculate the average precision for each label using the formula above, and then average the scores to obtain the overall average precision. Like the F1 score, this metric is sensitive to imbalanced label distributions.
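For reference, scikit-learn already implements most of these metrics; here is a small sketch on hypothetical predictions and scores:

import numpy as np
from sklearn.metrics import (average_precision_score, f1_score, hamming_loss,
                             jaccard_score, label_ranking_loss)

y_true = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]])   # true label indicator matrix
y_pred = np.array([[1, 0, 0], [0, 1, 1], [1, 0, 0]])   # hard predictions
y_score = np.array([[0.9, 0.2, 0.4], [0.1, 0.8, 0.7], [0.7, 0.4, 0.3]])  # model scores

print(hamming_loss(y_true, y_pred))                      # fraction of wrong labels
print(jaccard_score(y_true, y_pred, average='samples'))  # intersection over union per sample
print(f1_score(y_true, y_pred, average='macro'))         # per-label F1, then averaged
print(label_ranking_loss(y_true, y_score))               # wrongly ordered label pairs
print(average_precision_score(y_true, y_score))          # per-label AP, averaged (macro)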
Challenges in Evaluating Multitarget Classification Models
Evaluating multitarget classification models can be challenging due to the following factors:
- Imbalanced label distributions: Some labels may be more common than others, which can affect the performance of the model. For example, if a label is rare, the model may not have enough examples to learn from, leading to poor performance.
- Dependencies between labels: Some labels may be more likely to occur together than others, which can affect the performance of the model. For example, if a model is trained to predict labels A and B, but label A almost always occurs together with label B, the model may have difficulty predicting label A without also predicting label B.
- Multi-label evaluation metrics: There are multiple evaluation metrics for multitarget classification, each with its own strengths and limitations. Choosing the right metric can be difficult, as it depends on the specific requirements of the problem and the characteristics of the data.
Final Thoughts
Multitarget classification is a powerful tool for solving problems that involve predicting multiple labels for a single sample. It can be applied to a wide range of real-world applications, such as text classification, image annotation, and recommendation systems.
There are several approaches to multitarget classification, including binary classifiers, multiclass classifiers, and multitask learning. The choice of approach depends on the specific requirements of the problem and the characteristics of the data.
Evaluating the performance of a multitarget classification model can be challenging due to imbalanced label distributions, dependencies between labels, and the availability of multiple evaluation metrics. It is important to choose the right evaluation metric and to compare the performance of the model to a baseline.
The End
Marcello Politi