by Piyush Malhotra, Puneet and Tanishq Chamola

As the number of vehicles on the road increases, so does the number of road accidents, and according to reports India ranks first in the world for the number of road accidents. There are many reasons for this, such as poor enforcement of laws and carelessness. One of the reasons is that people do not recognize or follow traffic sign boards. So we have built a traffic sign recognizer that can inform the driver about an upcoming traffic sign so that they can follow it. This can help reduce road accidents.
Convolutional Neural Networks
Convolutional neural networks (CNNs) are a class of deep learning models used extensively in image recognition. A convolutional neural network consists of several layers. First, a Conv2D layer is used for feature extraction with the help of filters. The number of filters is generally a power of 2, such as 32, 64 or 128. An activation function is applied in this layer, generally ReLU (Rectified Linear Unit), which is defined as max(0, x).
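To make the ReLU definition concrete, here is a minimal numpy sketch. The input values are made up for illustration and are not taken from the model.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: element-wise max(0, x)."""
    return np.maximum(0, x)

# Hypothetical feature-map values, used only for illustration
feature_map = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
activated = relu(feature_map)
print(activated)
```

Negative values become 0, while positive values pass through unchanged.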
Next is the max pooling layer, which is used to reduce the dimensions of the image. This reduces the computation required to process the image. Third is the dropout layer, which is used to prevent overfitting and to reduce the complexity of the model. In this layer, some neurons are dropped at random during training.
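As a rough illustration of max pooling, the sketch below applies 2 x 2 pooling to a small made-up array using numpy only. The helper name max_pool_2x2 is hypothetical and is not part of keras.

```python
import numpy as np

def max_pool_2x2(image):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    h, w = image.shape
    # Trim odd rows/columns so the array divides evenly into 2 x 2 blocks
    trimmed = image[:h - h % 2, :w - w % 2]
    blocks = trimmed.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Made-up 4 x 4 "image" for illustration
image = np.array([[1, 3, 2, 0],
                  [4, 2, 1, 5],
                  [0, 1, 7, 2],
                  [3, 6, 1, 1]])
pooled = max_pool_2x2(image)
print(pooled)
```

The 4 x 4 input becomes a 2 x 2 output, so each dimension is halved.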
The combination of these first 3 layers is called the feature learning phase. These 3 layers are repeated multiple times to improve the training.
Fourth is the flatten layer, which converts the 2-D data into a long 1-D vector of features that can be fed into a fully connected layer.
The last layer is the dense layer, which is used as the output layer. It has as many nodes as there are classes and uses the softmax activation function. Softmax gives each class a probability value (between 0 and 1), so the model can predict the class with the highest probability.
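The softmax step can be sketched in a few lines of numpy. The scores below are hypothetical and only 3 classes are used to keep the example short.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

scores = np.array([2.0, 1.0, 0.1])  # hypothetical scores for 3 classes
probs = softmax(scores)
predicted_class = int(np.argmax(probs))  # index of the most probable class
print(probs, predicted_class)
```

The class with the largest score also gets the largest probability, so it is the one predicted.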
Traffic sign recognition
1. Dataset
We have taken the dataset from the German Traffic Sign Benchmark single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011. Link: kaggle.com/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign
The dataset consists of 39,209 traffic sign images.
2. Importing the necessary libraries
We will use Python for this. First we import the necessary libraries: keras for building the main model, sklearn for splitting the training and test data, PIL for converting the images into arrays of numbers, and other libraries such as pandas, numpy, matplotlib and tensorflow.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2
import tensorflow as tf
from PIL import Image
import os
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.models import Sequential, load_model
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
import tqdm
import warnings
3. Retrieving the images
We retrieve the images and their labels, then resize each image to (30, 30), since all images must have the same size for recognition. Finally, we convert the images into numpy arrays.
data = []
labels = []
classes = 43
# Each class has its own folder: train/0, train/1, ..., train/42
for i in range(classes):
    path = os.path.join(os.getcwd(), 'train', str(i))
    images = os.listdir(path)
    for j in images:
        try:
            # os.path.join inserts the path separator between folder and file name
            image = Image.open(os.path.join(path, j))
            image = image.resize((30, 30))
            image = np.array(image)
            data.append(image)
            labels.append(i)
        except Exception:
            print("Error loading image")
# Convert the lists into numpy arrays, which are faster and use less memory
data = np.array(data)
labels = np.array(labels)
print(data.shape, labels.shape)
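To see why every image is resized to the same shape before conversion, here is a small sketch with stand-in arrays. It uses zeros rather than real traffic sign images and assumes 3 color channels. Equally sized images stack into a single 4-D array of shape (number of images, height, width, channels).

```python
import numpy as np

# Stand-in "images": zeros instead of real pixels, assuming 3 color channels
fake_images = [np.zeros((30, 30, 3), dtype=np.uint8) for _ in range(5)]

# Because every image has the same shape, the list stacks into one array
batch = np.array(fake_images)
print(batch.shape)
```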
4. Splitting the dataset
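Split the dataset into train and test sets: 80% train data and 20% test data. The labels are then one-hot encoded with to_categorical, since the categorical_crossentropy loss used below expects one-hot labels.
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=68)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
y_train = to_categorical(y_train, 43)  # one-hot encode the 43 class labels
y_test = to_categorical(y_test, 43)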
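The categorical_crossentropy loss used when compiling the model expects one-hot labels, which is the format that to_categorical (imported earlier) produces. As a rough sketch of what one-hot encoding does, here is a numpy-only version. The helper one_hot is hypothetical and only 3 classes are used for brevity.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    encoded = np.zeros((len(labels), num_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

labels = np.array([0, 2, 1])  # hypothetical class ids
encoded = one_hot(labels, num_classes=3)
print(encoded)
```

Each row has a 1 in the column of its class and 0 everywhere else.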
5. Building the model
To build the model we will use the Sequential model from the keras library. Then we will add the layers to make the convolutional neural network. In the first 2 Conv2D layers we have used 32 filters with a kernel size of (5,5).
In the MaxPool2D layer we have kept the pool size at (2,2), which means it selects the maximum value of every 2 x 2 area of the image. This reduces each dimension of the image by a factor of 2. In the dropout layer we have kept the dropout rate at 0.25, which means 25% of the neurons are dropped at random.
We apply these layers again with some changes in parameters: the next 2 Conv2D layers use 64 filters with a kernel size of (3,3). Then we apply the flatten layer to convert the 2-D data into a 1-D vector. This layer is followed by a dense layer, a dropout layer and a dense layer again. The last dense layer has 43 nodes, as the traffic signs are divided into 43 categories in our dataset. This layer uses the softmax activation function, which gives a probability value for each class so that the model can predict which of the 43 options has the highest probability.
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=X_train.shape[1:]))
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(43, activation='softmax'))
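It can help to trace how the image size changes through the layers above. The sketch below does the arithmetic in plain Python, assuming the keras defaults that the code does not override (no padding and a stride of 1 for Conv2D, non-overlapping windows for MaxPool2D). The helper names conv_out and pool_out are made up for this example.

```python
def conv_out(size, kernel):
    """Output width of a stride-1 convolution with no padding ('valid')."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output width of non-overlapping max pooling."""
    return size // pool

size = 30                 # input images are 30 x 30
size = conv_out(size, 5)  # 26
size = conv_out(size, 5)  # 22
size = pool_out(size, 2)  # 11
size = conv_out(size, 3)  # 9
size = conv_out(size, 3)  # 7
size = pool_out(size, 2)  # 3

flattened = size * size * 64  # 64 filters in the last Conv2D layer
print(size, flattened)
```

Under these assumptions the flatten layer receives a 3 x 3 x 64 volume, that is 576 features, which feed the 256-node dense layer.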
6. Apply the model and plot the graphs for accuracy and loss
We will compile the model and train it using the fit function, with a batch size of 32. Then we will plot the graphs for accuracy and loss. We got an average validation accuracy of 97.6% and an average training accuracy of 93.3%.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size=32, epochs=2, validation_data=(X_test, y_test))
model.save("Trafic_signs_model.h5")
#plotting graphs for accuracy
plt.figure(0)
plt.plot(history.history['accuracy'], label='training accuracy')
plt.plot(history.history['val_accuracy'], label='val accuracy')
plt.title('Accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend()
plt.show()
#plotting graphs for loss
plt.figure(1)
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.title('Loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend()
plt.show()
7. Accuracy on test set
We got an accuracy of 94.7% on the test set.
from sklearn.metrics import accuracy_score
y_test = pd.read_csv('Test.csv')
labels = y_test["ClassId"].values
imgs = y_test["Path"].values
data = []
for img in imgs:
    image = Image.open(img)
    image = image.resize((30, 30))
    data.append(np.array(image))
X_test = np.array(data)
# predict_classes was removed in TensorFlow 2.6; taking the argmax of the
# predicted probabilities gives the same class ids
pred = np.argmax(model.predict(X_test), axis=-1)
# Accuracy with the test data
print(accuracy_score(labels, pred))
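The accuracy reported here is simply the fraction of test images whose predicted class matches the true class, where the predicted class is the index of the largest softmax probability. A small numpy sketch with made-up probabilities for 4 images and 3 classes:

```python
import numpy as np

# Hypothetical softmax outputs for 4 images and 3 classes
probabilities = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.8, 0.1],
                          [0.3, 0.3, 0.4],
                          [0.6, 0.3, 0.1]])
true_labels = np.array([0, 1, 2, 1])

# The predicted class is the column with the highest probability
predicted = np.argmax(probabilities, axis=1)
# Accuracy is the share of predictions that match the true labels
accuracy = float(np.mean(predicted == true_labels))
print(predicted, accuracy)
```

Here 3 of the 4 predictions are correct, so the accuracy is 0.75.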
Graphical user interface
Now that the model is ready, we can make a graphical user interface (GUI). We have used the tkinter library to make the GUI. Code of GUI:
Output

Conclusion
So we got to know about convolutional neural networks and how they can be used in image recognition. We made a traffic sign recognizer using convolutional neural networks and got an accuracy of 97.6% on the validation set and 94.7% on the test set.
The complete code is available in the following GitHub repository: Traffic sign recognition
Thank you.