Neural Network tutorial with Devanagari Characters using PyTorch

Training an artificial neural network for recognizing hand-drawn Devanagari characters with PyTorch.

Gopal Singh
Towards Data Science



As the name suggests, Neural Networks are loosely inspired by human brain neurons. But we won’t be covering any details about the brain analogies here. Rather, we will understand it with some math and coding.

Perceptron

Frank Rosenblatt proposed the perceptron in the 1950s, showing that an algorithm can imitate the decision-making capabilities of the human brain (and we are still trying). In his paper, he described a neuron that takes several binary inputs and produces a single binary output: the output fires if and only if the weighted sum of the inputs satisfies a certain threshold.

Perceptron model

You may ask:

But Gopal, we can also write a program to do this task; why bother writing a neural network? I am glad you asked.

The first reason for choosing a neural network over a hand-written program is that neural networks are universal function approximators: whatever function we are trying to model, however complex it is, a sufficiently large neural network can represent it.

If we can express a task in mathematical terms as a function, we can use a neural network to approximate that function.

The second reason is scalability and flexibility. We can easily stack more layers in a neural network, which increases its capacity to model more complex functions.

The basic architecture of neural networks

Neural networks consist of the following components:

  • An input layer, x
  • An arbitrary number of hidden layers
  • An output layer, 𝑦
  • A set of weights (𝑊) and biases (𝑏) between each layer
  • A choice of activation function for each hidden layer, 𝜎
Resource from https://tex.stackexchange.com/questions/132444/diagram-of-an-artificial-neural-network

We train our neural network for n iterations; each iteration consists of two steps:

  1. Feedforward
  2. Backpropagation

Feedforward:

In simple terms, the output of one layer becomes the input of the next layer. Networks organized this way are called Feedforward Networks.

There are no loops in our network to feed the information backward; it will always be fed forward.
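To make this concrete, here is a minimal sketch of a single forward pass through a two-layer network in PyTorch; the layer sizes are chosen only for illustration.

```python
import torch

# A tiny two-layer feedforward pass, sizes chosen only for illustration.
x = torch.randn(1, 32 * 32)              # one flattened 32x32 input image
W1, b1 = torch.randn(32 * 32, 128), torch.zeros(128)
W2, b2 = torch.randn(128, 46), torch.zeros(46)

h = torch.relu(x @ W1 + b1)              # hidden layer: activation of the weighted sum
y = h @ W2 + b2                          # output layer: one score per class
```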

Backpropagation:

This is the process where our neural networks actually learn from the training data.

But the question is still unanswered, how does our network learn to classify or predict? And if our model has predicted something, then how can our model be sure if it’s the right prediction or wrong?

The answer is the loss function; it measures how far the network’s prediction is from the true value.

For example, if our model predicts the price of a house to be $100K and the true price is $101K, then the difference between the true and predicted values is $1K; that is the kind of error the loss function quantifies for our network.

Coming back to backpropagation: after we calculate the error between our model’s prediction and the true value with the help of the loss function, we propagate this error backward through the network to update the weights and biases. This is called backpropagation.

But how much do we need to update our weights and biases?

To know the appropriate amount to adjust the weights and biases, we take the derivative of the loss function with respect to the weights and biases, and move them a small step in the direction that reduces the loss.
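As a toy illustration of this idea (the numbers and learning rate below are made up), PyTorch’s autograd can compute those derivatives for us, and we then nudge each parameter against its gradient:

```python
import torch

# Toy example: a one-parameter model and a squared-error loss.
w = torch.tensor(0.5, requires_grad=True)
b = torch.tensor(0.0, requires_grad=True)
x, y_true = torch.tensor(2.0), torch.tensor(3.0)

y_pred = w * x + b
loss = (y_pred - y_true) ** 2            # how far off the prediction is
loss.backward()                          # derivatives of the loss w.r.t. w and b

lr = 0.1                                 # learning rate (assumed)
with torch.no_grad():
    w -= lr * w.grad                     # adjust the weight against its gradient
    b -= lr * b.grad                     # adjust the bias against its gradient
```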

I think now we have enough intuition to start our coding section.

Implementing Artificial Neural Network for classifying hand-drawn Devanagari Characters

We will be using the PyTorch library to build our Neural Network.

I wrote a small helper function, plot_images, for displaying the characters along with their labels.
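The original helper isn’t reproduced here; a minimal sketch of what plot_images might look like (the 32×32 image size and the "character" label column come from the dataset described below, while the plotting details are assumptions) is:

```python
import matplotlib.pyplot as plt

def plot_images(df, n, label_col):
    """Display the first n characters in df along with their labels."""
    fig, axes = plt.subplots(1, n, figsize=(2 * n, 2))
    for i, ax in enumerate(axes):
        # All columns except the label column are pixel values of a 32x32 image.
        pixels = df.drop(columns=[label_col]).iloc[i].values.reshape(32, 32)
        ax.imshow(pixels, cmap="gray")
        ax.set_title(df[label_col].iloc[i])
        ax.axis("off")
    plt.show()
```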

Let’s see what we have in our CSV file.

df.head() gives us the first 5 rows of our data-frame. In our dataset, we have 1024 pixel columns (indexed 0 to 1023, one per pixel of each 32×32 image) and a character column indicating which character those pixel values represent.

Let’s run our plot_images function and take a look at the images.

>> plot_images(df, 4, "character")

We have 46 unique hand-drawn characters in our dataset; hence, 46 will be our neural network's output dimension.

But before creating the neural network, we need to prepare a data loader to feed our model during training and testing, because PyTorch models work with tensors, not raw NumPy data.
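The original gist isn’t embedded here; the following is a minimal sketch of the same idea. The file name, batch size, and the use of train_test_split are assumptions; the "character" column and the division by 255.0 come from the description below.

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import TensorDataset, DataLoader
from sklearn.model_selection import train_test_split   # split strategy assumed

df = pd.read_csv("devanagari.csv")                      # file name assumed

# Separate features and labels; divide by 255.0 to normalize pixel values.
features_numpy = df.drop(columns=["character"]).values.astype(np.float32) / 255.0
# Convert the categorical labels into numeric codes, since tensors need numbers.
targets_numpy = df["character"].astype("category").cat.codes.values

X_train, X_test, y_train, y_test = train_test_split(
    features_numpy, targets_numpy, test_size=0.2, random_state=42
)

def data_loader(features, targets, batch_size=100, shuffle=True):
    # Convert NumPy data to tensors if needed, wrap it in a TensorDataset,
    # and finally turn it into a DataLoader.
    if isinstance(features, np.ndarray):
        features = torch.from_numpy(features).float()
    if isinstance(targets, np.ndarray):
        targets = torch.from_numpy(targets).long()
    dataset = TensorDataset(features, targets)
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)

train_loader = data_loader(X_train, y_train)
test_loader = data_loader(X_test, y_test, shuffle=False)
```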

So, in the above code sample, we read the data and separate the features and labels from it. If you notice, I am also dividing features_numpy by 255.0; this normalizes the pixel values into the range [0, 1].

I then convert our categorical labels into numeric codes, because tensors can only be created from numerical data.

In our data_loader function, we take the features and targets, convert them to tensors if they are NumPy arrays, wrap them in a torch.utils.data.TensorDataset, and finally turn that dataset into a data loader.

That’s it; our data is now ready for feeding into the model. Let’s build the Neural Network now.
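The model definition from the original gist isn’t shown here; the following is a minimal sketch of such a model. The hidden-layer size and dropout probability are assumptions, while the input and output dimensions come from the text below.

```python
import torch.nn as nn
import torch.nn.functional as F

class ANNModel(nn.Module):
    def __init__(self, input_dim=32 * 32, hidden_dim=512, output_dim=46):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(p=0.2),                  # randomly drop ~20% of activations
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, x):
        # log_softmax turns the 46 output scores into (log) class probabilities.
        return F.log_softmax(self.layers(x), dim=1)
```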

In our ANNModel, the input dimension is 32*32 = 1024 (one value per pixel), and the output dimension of the last layer is 46. But there are 2 new terms here that I haven’t talked about yet.

Dropout: It randomly drops activations with a given probability. Let’s say we set the probability to 0.2; then during each feedforward and backpropagation pass, roughly 20% of the neurons are ignored. This helps prevent overfitting.

Softmax: In mathematics, the softmax function, also known as softargmax or normalized exponential function, is a function that takes as input a vector of K real numbers and normalizes it into a probability distribution consisting of K probabilities proportional to the exponentials of the input numbers.
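Written out for a vector z of K scores, the i-th softmax output is exp(z_i) / (exp(z_1) + … + exp(z_K)), so all K outputs are positive and sum to 1.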

With those two terms understood, we can now proceed with the training process.

The training process is simple: we iterate over the images and labels from our training data loader, clear the accumulated gradients, make predictions, compute the loss, backpropagate, and update the weights (see the sketch below). After running this, you should get an accuracy of over 94% on the testing data.
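For reference, here is a sketch of that loop; the optimizer, learning rate, and number of epochs are assumptions, and the negative log-likelihood loss is chosen to pair with the log_softmax output above.

```python
import torch.nn.functional as F
from torch import optim

model = ANNModel()
optimizer = optim.Adam(model.parameters(), lr=1e-3)   # optimizer and lr assumed

for epoch in range(10):                               # number of epochs assumed
    for images, labels in train_loader:
        optimizer.zero_grad()                         # clear the previous gradients
        outputs = model(images)                       # feedforward
        loss = F.nll_loss(outputs, labels)            # how far off the predictions are
        loss.backward()                               # backpropagation
        optimizer.step()                              # update weights and biases
```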

It won’t be possible to explain each step; please feel free to ask me in the comment section instead.

After the training, let’s see our training and validation curves.

Training vs. Validation loss

It looks good. :)

Inference:

Let’s check how our model is doing on the test data.
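The make_predictions helper isn’t reproduced here; given the call below, a rough sketch of what it might do (its internals are assumptions) is:

```python
import matplotlib.pyplot as plt
import torch

def make_predictions(loader, index):
    """Predict and display the character at the given index of the loader's dataset."""
    model.eval()                                      # switch off dropout for inference
    image, label = loader.dataset[index]
    with torch.no_grad():
        pred = model(image.unsqueeze(0)).argmax(dim=1).item()
    plt.imshow(image.reshape(32, 32), cmap="gray")
    plt.title(f"true: {label.item()}  predicted: {pred}")
    plt.axis("off")
    plt.show()
```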

>> make_predictions(test_loader, 44)
Inference

It’s doing pretty well. :)

References

Dataset:
