Neural Network Sidekick

Connor Shorten
Towards Data Science
2 min read · Sep 13, 2018


The idea of this article is to train a sidekick neural network specifically on the data that our primary network struggles with. A built-in hierarchy between the networks, analogous to Batman and Robin, helps the combined model perform better.

Ensemble learning is a very exciting concept despite its computational cost. There are two dominant classes of ensemble methods: bagging and boosting. With bagging, random samples of the training data are used to build separate models, and a majority vote across those models determines the output. Boosting is more interesting. Boosting builds a model, evaluates which subset of the data that model makes the most errors on, and then trains the subsequent model mostly on that data. This can be computationally expensive, and it is not always obvious how to turn the idea into a concrete algorithm.
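To make the bagging half concrete, here is a minimal sketch in Python. The build_model factory is a hypothetical stand-in for whatever classifier you are using; the only assumptions are that it returns a model with fit and predict methods and that labels are non-negative integers:

```python
import numpy as np

def train_bagged(build_model, x_train, y_train, n_models=5):
    # Train each model on a bootstrap sample (drawn with replacement).
    models = []
    for _ in range(n_models):
        idx = np.random.choice(len(x_train), size=len(x_train), replace=True)
        model = build_model()
        model.fit(x_train[idx], y_train[idx])
        models.append(model)
    return models

def bagging_predict(models, x):
    # Each model casts one vote per example; the most common label wins.
    votes = np.stack([np.argmax(m.predict(x), axis=1) for m in models])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```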

In a recent project, I built a monster image classifier using many convolutional layers and thousands of images. I have tried every trick in the book to boost the accuracy of this model, yet it still makes mistakes sometimes. A model that performs at about 90–95% accuracy can be considered a job well done; unfortunately, some tasks require higher accuracy than that.

Boosting seems like a great way to support this monolithic network. To extend the analogy, my primary convolutional network is Batman, and a convolutional network trained primarily on misclassified instances is Robin. Robin does not need to be the face of the team; rather, he should focus on learning Batman's weak spots. By becoming an expert on the cases that Batman tends to miss, Robin is able to contribute to a team in which Batman will likely solve most of the problem by himself anyway.

Implementing boosting with neural networks is tricky because there isn't exactly a library function we can call, such as Boosted_CNNs.

This is the rough workflow of how I implemented boosting to create a Batman-and-Robin-style classification model (a code sketch follows the list):

1. Partition the data into train and test splits
2. Train CNN1 on the training data
3. Evaluate CNN1 on the training data, outside of the training loop
4. Select the misclassified instances, plus 25% of the data
5. Train CNN2 on this new subset
6. For future predictions, weight the output of CNN1 at 80% and CNN2 at 20%
7. Evaluate the ensemble
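Here is a minimal sketch of that workflow in Keras. It assumes x_train, y_train, x_test, y_test, and num_classes already exist, that labels are integer-encoded, and that build_cnn is a hypothetical stand-in for your real architecture. I read "25% of the data" as a random sample of the correctly classified examples; adjust that interpretation to taste:

```python
import numpy as np
from tensorflow import keras

def build_cnn(input_shape, num_classes):
    # Hypothetical architecture; substitute your own stack of conv layers.
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        keras.layers.Conv2D(32, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Steps 1-2: train the primary network (Batman) on the training data.
cnn1 = build_cnn(x_train.shape[1:], num_classes)
cnn1.fit(x_train, y_train, epochs=10, batch_size=64)

# Step 3: evaluate CNN1 on the training data outside the training loop.
train_preds = np.argmax(cnn1.predict(x_train), axis=1)
wrong = np.where(train_preds != y_train)[0]
right = np.where(train_preds == y_train)[0]

# Step 4: misclassified instances plus a 25% sample of the rest.
extra = np.random.choice(right, size=len(right) // 4, replace=False)
subset = np.concatenate([wrong, extra])

# Step 5: train the sidekick network (Robin) on this harder subset.
cnn2 = build_cnn(x_train.shape[1:], num_classes)
cnn2.fit(x_train[subset], y_train[subset], epochs=10, batch_size=64)

# Step 6: weight CNN1 at 80% and CNN2 at 20% on new data.
probs = 0.8 * cnn1.predict(x_test) + 0.2 * cnn2.predict(x_test)
final_preds = np.argmax(probs, axis=1)

# Step 7: evaluate the ensemble.
print("Ensemble accuracy:", np.mean(final_preds == y_test))
```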

I found that you need a lot of data to make this work; even so, I saw a slight boost in accuracy on my specific problem. Thanks for reading!

CShorten

Connor Shorten is a Computer Science student at Florida Atlantic University. His research interests include computer vision, deep learning, and software engineering.
