
Your First Neural Network in PyTorch

Let's get that PC warm.

With deep learning getting hotter by the day and a plethora of advanced articles on the web, it’s easy to think of it as some advanced field reserved only for math Ph.D.s, but let’s prove that wrong.

Photo by Aziz Acharki on Unsplash

Deep learning, at least the practical part, has never been easier to get started with: the number of resources is growing and the libraries keep getting better.

This article is aimed at somebody who knows the basic theory of artificial neural networks but doesn’t know how to code one. It will be simpler than you expect, trust me.

The article is structured as follows:

  1. Imports and Dataset
  2. Train/Test Split
  3. Defining a Neural Network Model
  4. Model Training
  5. Model Evaluation
  6. Conclusion

It seems like a lot, but I promise you’ll read it in 10 minutes tops, or 15 if you decide to follow along with the code, provided you have the necessary libraries installed.

After reading the article you’ll have a basic idea of how to implement an artificial neural network in PyTorch and use it to make predictions on previously unseen data.

Keep in mind that the article doesn’t cover advanced topics, as those will come in the following articles. So without further ado, let’s get started.


Imports and Dataset

For this simple example we’ll use only a couple of libraries:

  • Pandas: for data loading and manipulation
  • Scikit-learn: for train-test split
  • Matplotlib: for data visualization
  • PyTorch: for model training

Here are the imports if you just want to copy/paste:

import torch
import torch.nn as nn
import torch.nn.functional as F
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

As for the dataset, we’ll use the Iris dataset, which can be loaded into Pandas directly from its URL:

iris = pd.read_csv('https://raw.githubusercontent.com/pandas-dev/pandas/master/pandas/tests/data/iris.csv')
iris.head()

The first couple of rows look like this:

What we want to do now is to change, or remap, values from the Name column to something numeric – let’s say 0, 1, 2. Here’s how to do so:

mappings = {
   'Iris-setosa': 0,
   'Iris-versicolor': 1,
   'Iris-virginica': 2
}
iris['Name'] = iris['Name'].apply(lambda x: mappings[x])

Executing the code from above results in the following DataFrame:

Which means we’re good to proceed!
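As a quick sanity check on the remapping, here is a minimal sketch on a tiny hand-made DataFrame (the three rows are illustrative stand-ins, not rows read from the real file). It uses Series.map instead of apply, which accepts the dictionary directly and produces NaN for any label missing from the mapping, so a typo in a class name is easy to spot.

```python
import pandas as pd

mappings = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}

# Tiny illustrative frame; the real data comes from iris.csv
sample = pd.DataFrame({'Name': ['Iris-setosa', 'Iris-virginica', 'Iris-versicolor']})

encoded = sample['Name'].map(mappings)

# Any label not present in `mappings` would show up as NaN here
assert encoded.notna().all()
print(encoded.tolist())  # [0, 2, 1]
```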


Train/Test Split

In this section, we’ll use the Scikit-Learn library to do a train/test split. Afterward, we’ll convert the split data from NumPy arrays to PyTorch tensors.

Let’s see how.

To start out, we need to split the Iris dataset into features and target – or X and y. The column Name will be the target variable and everything else will be a feature (or predictor).

I will also be using a random seed, so you are able to reproduce my results. Here’s the code:

X = iris.drop('Name', axis=1).values
y = iris['Name'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)
y_train = torch.LongTensor(y_train)
y_test = torch.LongTensor(y_test)

If you were now to check the first 3 rows from X_train you’d get this:

Same goes for the y_train:

We now have everything needed to create a neural network. Let’s do so in the next section.
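If you want to double-check the shapes and data types after the split, here is a small sketch. It assumes a synthetic stand-in with the same layout as Iris (150 rows, 4 features, 3 classes) so it runs without downloading anything, and it uses torch.tensor with an explicit dtype as an alternative to FloatTensor and LongTensor. The targets are 64-bit integers because CrossEntropyLoss expects class indices in that form.

```python
import numpy as np
import torch
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as Iris: 150 rows, 4 features, 3 classes
rng = np.random.default_rng(42)
X = rng.random((150, 4))
y = np.repeat([0, 1, 2], 50)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Features become 32-bit floats, targets become 64-bit integer class indices
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
y_test = torch.tensor(y_test, dtype=torch.long)

print(X_train.shape, X_test.shape)   # torch.Size([120, 4]) torch.Size([30, 4])
print(X_train.dtype, y_train.dtype)  # torch.float32 torch.int64
```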


Defining a Neural Network Model

As for the architecture of the model, it will be very simple. Let’s see how the network will be structured:

  1. Fully Connected Layer (4 input features (number of features in X), 16 output features (arbitrary))
  2. Fully Connected Layer (16 input features (number of output features from the previous layer), 12 output features (arbitrary))
  3. Output Layer (12 input features (number of output features from the previous layer), 3 output features (number of distinct classes))

And that’s pretty much it. Besides that, we’ll use ReLU for our activation function. Let’s see how to implement this in code:

class ANN(nn.Module):
    def __init__(self):
        super().__init__()
        # Two hidden fully connected layers followed by the output layer
        self.fc1 = nn.Linear(in_features=4, out_features=16)
        self.fc2 = nn.Linear(in_features=16, out_features=12)
        self.output = nn.Linear(in_features=12, out_features=3)

    def forward(self, x):
        # ReLU after each hidden layer; the output layer returns raw scores (logits)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.output(x)
        return x

PyTorch uses this object-oriented way of declaring models, and it’s fairly intuitive. In the constructor, you define all the layers and their sizes, and in the forward() method you define the forward pass.

As simple as that.
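To see what the forward pass actually returns, here is a minimal sketch that pushes a batch of 5 made-up samples through the same architecture. The random input is purely illustrative. Note that the network outputs raw scores (logits), one per class; applying softmax is one way to turn them into probabilities.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ANN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 16)
        self.fc2 = nn.Linear(16, 12)
        self.output = nn.Linear(12, 3)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.output(x)

model = ANN()
batch = torch.rand(5, 4)  # 5 made-up samples with 4 features each

logits = model(batch)             # calling the model runs forward()
probs = F.softmax(logits, dim=1)  # turn raw scores into probabilities

print(logits.shape)      # torch.Size([5, 3])
print(probs.sum(dim=1))  # each row sums to 1, up to floating-point error
```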

Let’s now make an instance of the model and verify that its architecture matches the one we specified above:

model = ANN()
model
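Another way to verify the architecture is to count the trainable parameters. Each fully connected layer has (inputs × outputs) weights plus one bias per output, so the expected total can be worked out by hand. The sketch below rebuilds the same layer sizes with nn.Sequential, which is only a shortcut to keep the example short, and compares the two numbers.

```python
import torch.nn as nn

# Same layer sizes as the ANN class, written as nn.Sequential for brevity
net = nn.Sequential(
    nn.Linear(4, 16), nn.ReLU(),
    nn.Linear(16, 12), nn.ReLU(),
    nn.Linear(12, 3),
)

# weights + biases per layer: 80 + 204 + 39
expected = (4 * 16 + 16) + (16 * 12 + 12) + (12 * 3 + 3)
actual = sum(p.numel() for p in net.parameters())

print(expected, actual)  # 323 323
```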

Great. Before we can train the model, there are a couple more things we need to declare:

  • Criterion: basically how we measure loss, we’ll use CrossEntropyLoss
  • Optimizer: optimization algorithm, we’ll use Adam with a learning rate of 0.01

Here’s how to implement it in code:

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
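If CrossEntropyLoss feels like a black box, this small sketch may help. It uses made-up scores for two samples and shows that the loss is the mean negative log-probability the model assigns to the correct class. It also shows that a model with no preference between 3 classes gets a loss of ln(3), roughly 1.0986, which is about where an untrained network on this problem would be expected to start.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

criterion = nn.CrossEntropyLoss()

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.0, 0.0, 0.0]])  # made-up raw scores for 2 samples
targets = torch.tensor([0, 2])            # class indices (int64)

loss = criterion(logits, targets)

# Same value computed by hand: mean negative log-probability of the true class
manual = -F.log_softmax(logits, dim=1)[torch.arange(2), targets].mean()
assert torch.allclose(loss, manual)

# All-equal scores over 3 classes cost ln(3) for that sample
uniform_loss = criterion(logits[1:], targets[1:])
print(round(uniform_loss.item(), 4), round(math.log(3), 4))  # 1.0986 1.0986
```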

And now the part we’ve been waiting for – model training!


Model Training

This part will also be extremely simple. We’ll train the model for 100 epochs, keeping track of time and loss. Every 10 epochs we’ll print the current status to the console: which epoch we’re on and what the current loss is.

Here’s the code:

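%%time
# %%time is a Jupyter cell magic; remove it when running as a plain script
epochs = 100
loss_arr = []
for i in range(epochs):
    # Forward pass: calling the model invokes forward()
    y_hat = model(X_train)
    loss = criterion(y_hat, y_train)
    # Store a plain float so the history does not keep the computation graph
    loss_arr.append(loss.item())

    if i % 10 == 0:
        print(f'Epoch: {i} Loss: {loss.item()}')

    # Clear old gradients, backpropagate, update the weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()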

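If you’re wondering what the last 3 lines are doing, the answer is simple: optimizer.zero_grad() clears the gradients left over from the previous iteration, loss.backward() runs backpropagation to compute new gradients, and optimizer.step() uses them to update the weights and biases so the model can actually "learn".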
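Why clear the gradients at all? PyTorch adds new gradients to whatever is already stored, so skipping zero_grad() would mix gradients from different iterations. Here is a minimal sketch on a toy two-element parameter. It assumes plain SGD rather than Adam, only because the SGD update rule is easy to check by hand.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

# First backward pass: the gradient of sum(w^2) is 2w
(w ** 2).sum().backward()
print(w.grad)  # tensor([2., 4.])

# Without zero_grad(), a second backward pass adds to the stored gradient
(w ** 2).sum().backward()
print(w.grad)  # tensor([4., 8.])

# Clear, recompute, then take one gradient-descent step: w <- w - lr * grad
optimizer.zero_grad()
(w ** 2).sum().backward()
optimizer.step()
print(w.detach())  # tensor([0.8000, 1.6000])
```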

Here’s the result of the above code:

That was fast – please don’t get used to that feeling.

If plain numbers mean absolutely nothing to you, here’s a visualization of our loss (epoch number on the x-axis and loss on the y-axis):
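A plot like that can be produced with a few lines of Matplotlib. The sketch below is one possible way to do it, assuming loss_arr holds one plain Python float per epoch (for example, collected with loss.item()). The plot_loss helper is a hypothetical name, and the placeholder values exist only so the example runs on its own; they are not real training output.

```python
import matplotlib.pyplot as plt

def plot_loss(losses):
    """Plot one loss value per epoch and return the figure."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(range(len(losses)), losses)
    ax.set_xlabel('Epoch')
    ax.set_ylabel('Loss')
    ax.set_title('Training loss')
    return fig

# Placeholder values so the sketch runs on its own; pass your real loss_arr instead
placeholder_losses = [1.1 * 0.95 ** i for i in range(100)]
fig = plot_loss(placeholder_losses)
```

In a notebook the figure is displayed automatically; in a script, call plt.show() afterwards.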

So, we’ve trained the model, but what now? We need to evaluate it on the previously unseen data somehow. Stay here for a minute more and you’ll find out how.


Model Evaluation

In the evaluation process, we want to keep track of the predictions made by the model. We’ll iterate over X_test, make a prediction for each row, and later compare it to the actual value.

We will use torch.no_grad() here because we’re just evaluating: no gradients are needed, so PyTorch can skip tracking them, which saves memory and computation.

Anyway, here’s the code:

preds = []
with torch.no_grad():
    for val in X_test:
        # Predicted class = index of the largest output score
        y_hat = model(val)
        preds.append(y_hat.argmax().item())
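The row-by-row loop is easy to read, but the model can also score the whole test set in one call. Here is a sketch comparing the two approaches. It uses an untrained stand-in model and random inputs (both assumptions made only to keep the example self-contained), so the predicted classes themselves are meaningless; the point is that both approaches should agree.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Untrained stand-in model and random inputs, only to compare the two approaches
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
X_test = torch.rand(30, 4)

model.eval()
with torch.no_grad():
    # One sample at a time, as in the loop above
    loop_preds = [model(row).argmax().item() for row in X_test]
    # Whole batch at once: argmax over the class dimension
    batch_preds = model(X_test).argmax(dim=1).tolist()

print(loop_preds == batch_preds)
```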

The predictions are now stored in the preds list. We can now make a Pandas DataFrame with the following 3 columns:

  • Y: actual value
  • YHat: predicted value
  • Correct: flag, 1 indicating Y and YHat match, 0 otherwise

Here’s the code:

df = pd.DataFrame({'Y': y_test, 'YHat': preds})
df['Correct'] = [1 if corr == pred else 0 for corr, pred in zip(df['Y'], df['YHat'])]

The first 5 rows of the df will look like this:

That’s all great, but how do we actually calculate accuracy?

Well, it’s simple: we only need to sum up the Correct column and divide it by the length of df:

df['Correct'].sum() / len(df)
>>> 1.0
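The same calculation can be done directly on tensors, without building a DataFrame. A minimal sketch with made-up labels and predictions (3 of the 4 match, so the result is 0.75):

```python
import torch

y_true = torch.tensor([0, 1, 2, 1])  # made-up labels
y_pred = torch.tensor([0, 1, 1, 1])  # made-up predictions

# Element-wise comparison gives booleans; their mean is the share of correct predictions
accuracy = (y_true == y_pred).float().mean().item()
print(accuracy)  # 0.75
```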

The accuracy of our model on previously unseen data is 100%. Keep in mind that this is only because the Iris dataset is very simple to classify. It is by no means a claim that neural networks are the best algorithm for this dataset. I’d say a neural network is overkill for this type of problem, but that’s a discussion for another time.


Conclusion

And there you have it: the simplest neural network you’ll ever write, with a perfect and clean dataset, no missing values, and very few layers and neurons. Admit it, it was easy.

The next time it won’t be, as some more advanced concepts will be introduced.

Thanks for reading. Bye.



Join Medium with my referral link – Dario Radečić

