
Building a Road Sign Classifier
Learning to use the power of CNNs
Every year, automakers are adding more advanced driver-assistance systems (ADAS) to their fleets. These include adaptive cruise control (ACC), forward collision warning (FCW), automatic parking, and more. One study found that ADAS could prevent up to 28% of all crashes in the United States. This technology will only improve, and will eventually develop into Level 5, fully autonomous cars.
For a car to completely drive itself, it needs to be able to understand its environment. This includes other vehicles, pedestrians, and road signs.
Road signs give us important information about the law, warn us about dangerous conditions, and guide us to our desired destination. If a car cannot distinguish between their symbols, colours, and shapes, many people could be seriously injured.
The way a car sees the road is different from how we perceive it. We can all tell the difference between road signs and various traffic situations instantly, but when an image is fed to a computer, it just sees a grid of numbers. That means we need to teach the car to identify signs the way we do.
To solve this problem, I tried building my own convolutional neural network (CNN) to classify traffic signs. In this process, there are three main steps: preprocessing images, building the convolutional neural network, and outputting a prediction.
Preprocessing images
In the preprocessing stage, the images are imported from the "german-traffic-signs" Bitbucket repository. This contains a dataset of labelled images, which allows us to build a supervised learning model. The repository can be cloned into a Google Colab notebook, making it easy to import the dataset and start coding.
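In a Colab cell, the clone and import look roughly like this (the repository URL, pickle filenames, and dictionary keys are assumptions based on the dataset's usual layout):

```python
# Clone the dataset into the Colab workspace (assumed URL)
!git clone https://bitbucket.org/jadslim/german-traffic-signs

import pickle

# The repository ships pickled train/validation/test splits (assumed filenames)
with open('german-traffic-signs/train.p', 'rb') as f:
    train = pickle.load(f)

X_train, y_train = train['features'], train['labels']
print(X_train.shape)  # e.g. (34799, 32, 32, 3): 32x32 RGB images
```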
Now, to make use of this dataset, the images are fed through greyscale and equalization functions.
Greyscale
Currently, the images from the repository are three-dimensional. This is because coloured pictures have three colour channels (red, green, and blue, or RGB) which are stacked on top of each other to give them their vibrant colours.
For this machine learning model, three channels aren't necessary; only the features of the signs matter. Passing the dataset images through a greyscale function cleans up our data and keeps only the important information, reducing each image to a single channel.
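As a minimal sketch, greyscaling can be done with OpenCV (assuming the images are loaded as RGB arrays):

```python
import cv2

def greyscale(img):
    # Collapse the three RGB colour channels into one intensity channel
    return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
```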

Equalize
Now that the images are greyscaled, they have lost some of their contrast, the spread between the lightest and darkest pixels. To restore that contrast, the images must be equalized. This is important because the model distinguishes features by their changes in contrast.
Equalizing an image means spreading out its pixel-intensity distribution, creating a wider range between the darkest and brightest values in the image.
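OpenCV's histogram equalization makes this a one-liner. Combined with the greyscale step, the whole preprocessing pipeline might look like this (the final scaling to [0, 1] is a common extra step, not part of the description above):

```python
def equalize(img):
    # Spread the greyscale histogram across the full 0-255 range
    return cv2.equalizeHist(img)

def preprocess(img):
    img = greyscale(img)   # 3 channels -> 1 channel
    img = equalize(img)    # boost contrast
    return img / 255.0     # scale pixel values to [0, 1] for training
```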

Convolutional neural network
A convolutional neural network is a class of deep learning networks, used to analyze visual imagery. In this case, it is being used to find unique sets of features between the variety of road signs.

The process it uses is similar to how our eyes and brains sort everything we see. For example, when looking at a set of numbers, you can tell the difference between a 1 and an 8. A 1 is a vertical line, while an 8 is a loop on top of another loop. Of course, you don't actually say this in your head; you've seen these digits so many times that recognizing them has become a habit.
How do they learn?
For a convolutional neural network to extract the important features of an image, it uses kernels (small filters) that stride across the image.
I think of it as your eyes moving in saccades over an image. They analyze one part and move horizontally to the next section until you’ve seen the whole picture.

Kernels compare what they see against the pattern they're looking for. When a feature matches, it is recorded and stored in a feature map. These feature maps are refined versions of the original image: they keep the important features of the sign and ignore the rest. Several different kernels pass over the original image, each extracting a different important feature, and together they produce the final convolved representation.
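To make the striding concrete, here is a toy convolution in plain NumPy (an illustration of the mechanism, not the model's actual implementation):

```python
import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel across the image and record its response at each position
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map

# A vertical-edge kernel: it responds strongly wherever brightness
# changes from left to right, e.g. the edge of a sign against the sky
vertical_edge = np.array([[1.0, 0.0, -1.0],
                          [1.0, 0.0, -1.0],
                          [1.0, 0.0, -1.0]])
```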

Solving overfitting
When working with a small dataset like the one used in this model, an issue called overfitting arises. This is when the model starts to memorize the training images instead of learning their features. More specifically, when the model goes through too many epochs (an epoch is one full pass through the dataset), it starts relying on the input of some nodes and ignoring others. This reduces the accuracy of the model because it won't know how to classify new images from outside the dataset.
To solve it, a dropout layer is added, which is a simple fix for this model. By dropping out a random subset of nodes during training, it prevents overfitting because no single node can memorize the labels (there's a high probability that the node will be turned off). It's like the teacher who calls on the kid who isn't paying attention in class. By putting him on the spot, she'll (hopefully) get him to focus and provide value to the class.
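In Keras, dropout is a single layer in the network. Below is a minimal LeNet-style sketch with illustrative layer sizes (not necessarily the exact architecture this model used); the German dataset has 43 sign classes, hence the final layer:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(60, (5, 5), activation='relu', input_shape=(32, 32, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(30, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(500, activation='relu'),
    Dropout(0.5),  # randomly turn off half the nodes on each training step
    Dense(43, activation='softmax'),  # one output per sign class
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```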
Prediction
Finally, the model is given an image of a traffic sign, runs it through the convolutional neural network, and outputs the number associated with the corresponding sign.
When the following random sign is run through the model…

The model predicts the class as [1], which is correct!
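For reference, producing a prediction like that takes only a few lines (a sketch that reuses the preprocess function and trained model from above, and assumes img holds a 32x32 RGB sign image):

```python
import numpy as np

x = preprocess(img).reshape(1, 32, 32, 1)         # match the network's input shape
prediction = np.argmax(model.predict(x), axis=1)  # pick the highest-scoring class
print(prediction)  # e.g. [1]
```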

For anyone interested in the code, you can find it on my GitHub!
Key Takeaways
- Images are preprocessed with greyscale and equalization functions
- A Convolutional Neural Network (CNN) uses kernels to extract the features of a sign
- The extracted features are scored against the learned sign classes to make a prediction
Hey, I’m Kael Lascelle, a sixteen-year-old Innovator at The Knowledge Society! I have a passion for autonomous systems, especially self-driving cars, as well as sustainable energy.
I would appreciate it if you could follow me on Medium and Twitter! Also, add me on LinkedIn, or send me an email.