Building and Deploying an Alphabet Recognition System

Using Anvil to deploy a Convolutional Neural Network (CNN) model to a website

Sakshi Butala
Towards Data Science


After Deployment

In this article, I am going to show you how to build an Alphabet Recognition System using Convolutional Neural Networks (CNNs) and deploy it using anvil.works. At the end of this post, you will be able to create an exact replica of the system shown above.

Table of Contents

Convolutional Neural Network

CNN Implementation

Anvil Integration

Convolutional Neural Network

Let’s start by understanding what exactly a Convolutional Neural Network is. A Convolutional Neural Network (CNN) is a type of neural network widely used for image recognition and classification.

CNNs are regularised versions of multilayer perceptrons. Multilayer perceptrons usually mean fully connected networks, that is, each neuron in one layer is connected to all neurons in the next layer.

A CNN consists of the following layers:

Convolution layer: A “kernel” of size, for example, 3×3 or 5×5, is passed over the image, and at each position the dot product of the underlying pixel values with the weights defined in the kernel is calculated. The resulting matrix (the feature map) is then passed through the “ReLU” activation function, which converts every negative value in the matrix to zero.

Image explaining the operation of convolution layer by Shervine Amidi
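
The convolution step described above can be sketched in a few lines of NumPy. This is only an illustration, assuming a single-channel image, no padding and a stride of 1; the function name conv2d_relu and the example values are made up for this sketch, and Keras uses a far more optimised implementation.

```python
import numpy as np

def conv2d_relu(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1), then apply ReLU."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the patch and the kernel, summed up
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    # ReLU: every negative value becomes zero
    return np.maximum(out, 0)

# Hypothetical 4x4 "image" and a 3x3 vertical-edge kernel
image = np.array([[1, 2, 0, 1],
                  [0, 1, 3, 1],
                  [2, 1, 0, 0],
                  [1, 0, 1, 2]], dtype=float)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

feature_map = conv2d_relu(image, kernel)
print(feature_map)
```

A 3×3 kernel on a 4×4 input leaves a 2×2 feature map, which is why each convolution layer in the model further down shrinks the image slightly.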

Pooling layer: A “pooling matrix” of size, for example, 2×2 or 4×4, is passed over the feature map to reduce its size while keeping only the most important features of the image.

There are 2 types of pooling operations:

  1. Max Pooling is a type of pooling in which the maximum value present inside the pooling matrix is put inside the final matrix.
  2. Average Pooling is a type of pooling in which the average of all the values present inside the pooling matrix is calculated and put inside the final matrix.
Image explaining max and average pooling via Stack Overflow
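
Both pooling operations can be tried out with a small NumPy sketch. It assumes non-overlapping windows (the stride equals the pool size); pool2d and the sample matrix are hypothetical names and values used only for illustration.

```python
import numpy as np

def pool2d(matrix, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window."""
    h, w = matrix.shape
    # Drop trailing rows/columns that do not fill a complete window
    h, w = h - h % size, w - w % size
    blocks = matrix[:h, :w].reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

m = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 0],
              [1, 4, 3, 5]], dtype=float)

max_pooled = pool2d(m, mode="max")
avg_pooled = pool2d(m, mode="average")
print(max_pooled)
print(avg_pooled)
```

In both cases the 4×4 matrix is reduced to 2×2; max pooling keeps the strongest response in each window, while average pooling smooths over it.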

(Note: There can be more than one combination of Convolution and Pooling layer in a CNN architecture to improve its performance.)

Fully connected layer: The final matrix is then flattened into a one-dimensional vector. This vector is then passed into a fully connected neural network. Finally, the output layer is a list of probabilities for the different possible labels attached to the image (e.g. the letters a, b, c). The label that receives the highest probability is the classification decision.

Image displaying fully connected layer by Shervine Amidi
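
To make the last step concrete, here is a minimal sketch of how raw output scores become probabilities and then a label. The three labels and the score values are invented for the example; the softmax function itself is the standard definition.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

labels = ["a", "b", "c"]
scores = np.array([1.0, 3.0, 0.5])  # hypothetical raw scores from the last layer

probs = softmax(scores)
predicted = labels[int(np.argmax(probs))]  # label with the highest probability
print(probs, predicted)
```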

CNN Implementation

Let’s start the implementation by importing the libraries inside a Jupyter Notebook as shown below:

import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation
import os
import pickle

Then, let us import the 2 datasets containing images from a to z for training and testing our model. You can download the datasets from my GitHub repository linked below.

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

train_generator = train_datagen.flow_from_directory(
    directory = 'Training',
    target_size = (32,32),
    batch_size = 32,
    class_mode = 'categorical'
)

test_generator = test_datagen.flow_from_directory(
    directory = 'Testing',
    target_size = (32,32),
    batch_size = 32,
    class_mode = 'categorical'
)

ImageDataGenerator generates batches of tensor image data. The rescale argument multiplies every pixel by 1/255, converting the RGB values from the range 0–255 to target values between 0 and 1.

shear_range is used for randomly applying shearing transformations.

zoom_range is used for randomly zooming inside pictures.

horizontal_flip is used for randomly flipping half of the images horizontally.

Then we load the images from the directories in batches using .flow_from_directory, which applies the ImageDataGenerator transformations to them.

We then resize the images from their original size to our target_size and declare the batch_size, which is the number of training examples used in one iteration.

Then we set the class_mode to categorical, indicating that we have multiple classes (a to z) to predict.
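
It helps to know which index each letter receives, because the predictions later on are indices, not letters. Keras infers the classes from the subfolder names in alphanumeric order, and the real mapping is available as train_generator.class_indices. The sketch below only mimics that ordering with the standard library; infer_class_indices and the temporary folders are hypothetical and are not part of Keras.

```python
import os
import tempfile

def infer_class_indices(directory):
    """Mimic the alphanumeric folder-to-index mapping used for class labels."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}

# Build a throwaway directory with three class folders, created out of order
with tempfile.TemporaryDirectory() as root:
    for letter in ["c", "a", "b"]:
        os.makedirs(os.path.join(root, letter))
    class_indices = infer_class_indices(root)

print(class_indices)
```

With folders named a to z, index 0 therefore corresponds to 'a' and index 25 to 'z'.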

Next we build our CNN architecture.

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape = (32,32,3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))


model.add(Conv2D(32, (3, 3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Flatten())
model.add(Dense(units = 128, activation = 'relu'))
model.add(Dense(units = 26, activation = 'softmax'))


model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

model.summary()

We start by creating a Sequential model which allows us to define the CNN architecture layer by layer using the .add function.

We first add a convolution layer with 32 filters of size 3X3 on the input images and pass it through the ‘relu’ activation function.

We then perform MaxPooling operations using a pool of size 2X2.

These layers are then repeated once again to improve the performance of the model.

Finally, we flatten the resulting matrix and pass it through a dense layer consisting of 128 nodes. This is then connected to the output layer consisting of 26 nodes, each node representing a letter. We use the softmax activation, which converts the scores to a normalised probability distribution, and the node with the highest probability is selected as the output.
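
The layer sizes that model.summary() reports can be checked by hand. The sketch below assumes the Keras defaults used in the code above (no padding and a stride of 1 for Conv2D, a stride equal to the pool size for MaxPooling2D); the helper names are made up for this example.

```python
def conv_out(size, kernel=3):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling."""
    return size // pool

size = 32
size = pool_out(conv_out(size))   # 32 -> 30 -> 15
size = pool_out(conv_out(size))   # 15 -> 13 -> 6
flat = size * size * 32           # flattened vector length

# Trainable parameters: (weights per unit + 1 bias) * number of units
params = {
    "conv_1": (3 * 3 * 3 + 1) * 32,    # 3 input channels, 32 filters
    "conv_2": (3 * 3 * 32 + 1) * 32,   # 32 input channels, 32 filters
    "dense_1": (flat + 1) * 128,
    "dense_2": (128 + 1) * 26,
}
total = sum(params.values())
print(flat, params, total)
```

Most of the parameters sit in the first dense layer, which is typical for small CNNs of this shape.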

Once our CNN architecture is defined, we compile the model using the Adam optimizer and the categorical cross-entropy loss.

Lastly, we train our model as follows.

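# Note: fit_generator is deprecated in newer Keras/TensorFlow releases,
# where model.fit accepts generators directly with the same arguments.
model.fit_generator(train_generator,
                    steps_per_epoch = 16,
                    epochs = 3,
                    validation_data = test_generator,
                    validation_steps = 16)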

The accuracy achieved after training the model is: 93.42%
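
A note on the training arguments: steps_per_epoch counts batches, not images. The arithmetic below shows how many images one epoch draws with the settings above, and how many steps a full pass over the data would need. The dataset size n_train_images is a made-up number for illustration; substitute the count that flow_from_directory prints for your data.

```python
import math

batch_size = 32
steps_per_epoch = 16

# Images drawn from the generator in one epoch
images_per_epoch = batch_size * steps_per_epoch

# Hypothetical dataset size, used only to illustrate the calculation
n_train_images = 13000
full_pass_steps = math.ceil(n_train_images / batch_size)

print(images_per_epoch, full_pass_steps)
```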

Let’s now try testing our model. But before we do that, we need to define a function that gives us the letter associated with the result.

def get_result(result):
    # result has shape (1, 26): one probability per letter, in alphabetical order.
    # Pick the most probable class instead of testing for a probability of exactly 1,
    # which a softmax output rarely reaches.
    letters = 'abcdefghijklmnopqrstuvwxyz'
    return letters[int(np.argmax(result[0]))]

Finally, let us test our model as follows:

filename = r'Testing\e\25.png'
test_image = image.load_img(filename, target_size = (32,32))
plt.imshow(test_image)
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
result = model.predict(test_image)
result = get_result(result)
print ('Predicted Alphabet is: {}'.format(result))

The model correctly predicts the letter in the input image to be ‘e’.
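
One detail worth checking: the training and testing generators rescale pixels by 1/255, while the prediction snippet passes the raw 0–255 array from img_to_array to model.predict. A minimal sketch of a preprocessing helper that applies the same scaling is shown below. The name preprocess is hypothetical and the random array merely stands in for a loaded image; whether this changes the predictions for your model is something to verify on your own data.

```python
import numpy as np

def preprocess(pixels):
    """Scale raw 0-255 pixel values to 0-1 and add a batch dimension."""
    scaled = np.asarray(pixels, dtype="float32") / 255.0
    return np.expand_dims(scaled, axis=0)

# Stand-in for image.img_to_array(test_image): a random 32x32 RGB image
fake_image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)

batch = preprocess(fake_image)
print(batch.shape, batch.min(), batch.max())
```

The returned batch has the shape (1, 32, 32, 3) that model.predict expects for a single image.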

Anvil Integration

Anvil is a platform that allows us to build full stack web applications with Python. It makes it easier for us to turn a machine learning model from a Jupyter notebook into a web application.

Let’s start by creating an account on Anvil. Once done, create a new blank app with material design.

Check out this link for a step by step tutorial on how to use Anvil.

The toolbox on the right contains all the components that can be dragged onto the website.

Components needed:

  1. 2 Labels (For the heading and sub heading)
  2. Image (To display the input image)
  3. FileLoader (To upload the input image)
  4. Highlighted Button (To predict the results)
  5. Label (To view the results)

Drag and drop these components and arrange them as per your requirement.

To add the heading and subheading, select the label, go to the option named ‘text’ in the properties section on the right side as shown below (highlighted in red), and type the heading/subheading.

Once the User Interface is completed, go inside the Code section as shown above (highlighted in green) and create a new function as follows:

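def primary_color_1_click(self, **event_args):
    # Runs when the PREDICT button is clicked
    file = self.file_loader_1.file          # image uploaded through the FileLoader
    self.image_1.source = file              # show the uploaded image
    result = anvil.server.call('model_run', file)
    self.label_3.text = result              # display the prediction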

This function will execute when we press the PREDICT button. It will take the input image uploaded from the file loader and pass it to the Jupyter notebook’s ‘model_run’ function. That function will return the predicted letter, which is displayed via the label component (label_3).

All that is left to do now is to connect our Anvil website to the Jupyter notebook.

This requires the implementation of 2 steps as follows:

  1. Import the Anvil uplink key: click on the settings button, then click on uplink, click on enable uplink key and copy the key.

Inside your Jupyter notebook, paste the following:

import anvil.server
import anvil.media
anvil.server.connect("paste your anvil uplink key here")

  2. Create a function ‘model_run’ which predicts the image uploaded to the website.

@anvil.server.callable
def model_run(path):
    # `path` is the media object sent from the website; TempFile writes it to disk
    with anvil.media.TempFile(path) as filename:
        test_image = image.load_img(filename, target_size = (32,32))
        test_image = image.img_to_array(test_image)
        test_image = np.expand_dims(test_image, axis = 0)
        result = model.predict(test_image)
        result = get_result(result)
        return ('Predicted Alphabet is: {}'.format(result))

And, yes! Now you can go back to Anvil and hit the run button to discover a fully working Alphabet Recognition System.

You can find the source code and the datasets in my GitHub repository.
