With the field of deep learning getting hotter by the day and a plethora of advanced articles on the web, it’s easy to think of deep learning as an advanced field reserved for math Ph.D.s. Let’s prove that wrong.

The practical side of deep learning has never been easier to get started with, as the number of resources keeps growing and the libraries keep getting better.
This article is aimed at somebody who knows the basic theory of artificial neural networks but doesn’t know how to code one. It will be simpler than you expect, trust me.
The article is structured as follows:
- Imports and Dataset
- Train/Test Split
- Defining a Neural Network Model
- Model Training
- Model Evaluation
- Conclusion
It seems like a lot, but I promise you’ll read it in 10 minutes tops, 15 if you decide to follow along with the code, provided you have the necessary libraries installed.
After reading the article you’ll have a basic idea of how to implement an artificial neural network in the PyTorch library and use it to make predictions on previously unseen data.
Keep in mind that the article doesn’t cover advanced topics, as those will come in the following articles. So without further ado, let’s get started.
Imports and Dataset
For this simple example we’ll use only a couple of libraries:
- Pandas: for data loading and manipulation
- Scikit-learn: for the train/test split
- Matplotlib: for data visualization
- PyTorch: for model training
Here are the imports if you just want to copy/paste:
import torch
import torch.nn as nn
import torch.nn.functional as F
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
As for the dataset, we’ll use the Iris dataset, which can be found at the URL below. Here’s how to load it into Pandas directly:
iris = pd.read_csv('https://raw.githubusercontent.com/pandas-dev/pandas/master/pandas/tests/data/iris.csv')
iris.head()
The first couple of rows look like this:

What we want to do now is to change, or remap, the values in the Name column to something numeric, say 0, 1, 2. Here’s how to do so:
mappings = {
    'Iris-setosa': 0,
    'Iris-versicolor': 1,
    'Iris-virginica': 2
}
iris['Name'] = iris['Name'].apply(lambda x: mappings[x])
Executing the code from above results in the following DataFrame:

Which means we’re good to proceed!
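As a side note, the same remapping can be sketched with Series.map, which accepts the dictionary directly. The small toy DataFrame below is a hypothetical stand-in for the real Iris data, used only so the snippet runs without downloading anything:

```python
import pandas as pd

# Toy stand-in for the Iris DataFrame (hypothetical values); only 'Name' matters here.
toy = pd.DataFrame({
    'SepalLength': [5.1, 7.0, 6.3, 4.9],
    'Name': ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica', 'Iris-setosa'],
})

mappings = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}

# Series.map looks every value up in the dictionary. Labels missing from the
# dictionary become NaN instead of raising a KeyError, so count them afterwards.
toy['Name'] = toy['Name'].map(mappings)
n_unmapped = int(toy['Name'].isna().sum())
labels = toy['Name'].tolist()

print(labels, n_unmapped)
```

If n_unmapped is anything other than 0, some label in the column was misspelled or missing from the dictionary.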
Train/Test Split
In this section, we’ll use the Scikit-Learn library to do a train/test split. Afterward, we’ll convert the split data from Numpy arrays to PyTorch tensors.
Let’s see how.
To start out, we need to split the Iris dataset into features and target, or X and y. The column Name will be the target variable and everything else will be a feature (or predictor).
I will also be using a random seed, so you are able to reproduce my results. Here’s the code:
X = iris.drop('Name', axis=1).values
y = iris['Name'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)
y_train = torch.LongTensor(y_train)
y_test = torch.LongTensor(y_test)
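A quick way to sanity-check a split like this is to look at the shapes and dtypes of the resulting tensors. The sketch below uses randomly generated stand-in data with the same layout as Iris (150 rows, 4 features, 3 classes), and it uses torch.tensor with an explicit dtype, which should be equivalent to the FloatTensor and LongTensor constructors for this purpose. All variable names here are made up for the illustration:

```python
import numpy as np
import torch
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Iris: 150 rows, 4 features, 3 classes of 50 rows each.
rng = np.random.default_rng(0)
X_demo = rng.random((150, 4))
y_demo = np.repeat([0, 1, 2], 50)

Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# Features become 32-bit floats; class labels become 64-bit integers,
# which is the label type CrossEntropyLoss expects.
Xtr_t = torch.tensor(Xtr, dtype=torch.float32)
Xte_t = torch.tensor(Xte, dtype=torch.float32)
ytr_t = torch.tensor(ytr, dtype=torch.long)
yte_t = torch.tensor(yte, dtype=torch.long)

print(Xtr_t.shape, Xte_t.shape, Xtr_t.dtype, ytr_t.dtype)
```

With test_size=0.2 on 150 rows, 120 rows end up in the training set and 30 in the test set.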
If you were now to check the first 3 rows from X_train, you’d get this:

The same goes for y_train:

We now have everything needed to create a neural network, so let’s do that in the next section.
Defining a Neural Network Model
As for the architecture of the model, it will be very simple. Let’s see how the network will be structured:
- Fully Connected Layer (4 input features (number of features in X), 16 output features (arbitrary))
- Fully Connected Layer (16 input features (number of output features from the previous layer), 12 output features (arbitrary))
- Output Layer (12 input features (number of output features from the previous layer), 3 output features (number of distinct classes))
And that’s pretty much it. Besides that, we’ll use ReLU for our activation function. Let’s see how to implement this in code:
class ANN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(in_features=4, out_features=16)
        self.fc2 = nn.Linear(in_features=16, out_features=12)
        self.output = nn.Linear(in_features=12, out_features=3)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.output(x)
        return x
PyTorch uses this object-oriented way of declaring models, and it’s fairly intuitive. In the constructor you define all the layers and their sizes, and in the forward() method you define the forward pass.
As simple as that.
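To make the layer sizes concrete, here is a small sketch that repeats the class so it runs on its own, pushes a random dummy batch through it, and counts the trainable parameters. The dummy batch and the variable names dummy_batch, logits and n_params are only for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ANN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(in_features=4, out_features=16)
        self.fc2 = nn.Linear(in_features=16, out_features=12)
        self.output = nn.Linear(in_features=12, out_features=3)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.output(x)

torch.manual_seed(0)
model = ANN()

# A dummy batch of 5 samples with 4 features each (random values, not Iris data).
dummy_batch = torch.rand(5, 4)
logits = model(dummy_batch)

# Each Linear layer holds in*out weights plus out biases:
# (4*16 + 16) + (16*12 + 12) + (12*3 + 3) = 80 + 204 + 39 = 323
n_params = sum(p.numel() for p in model.parameters())

print(logits.shape, n_params)
```

The output has one row per sample and one column per class, which is the shape the loss function will expect later on.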
Let’s now make an instance of the model and verify that its architecture matches the one we specified above:
model = ANN()
model

Great. Before we can train the model, there are a couple more things we need to declare:
- Criterion: how we measure loss; we’ll use CrossEntropyLoss
- Optimizer: the optimization algorithm; we’ll use Adam with a learning rate of 0.01
Here’s how to implement it in code:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
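One detail worth knowing is that CrossEntropyLoss works on the raw outputs of the last layer (logits) and on integer class indices, which is why the model above has no softmax at the end. The sketch below, using made-up logits and targets, checks that the criterion gives the same value as applying log-softmax followed by the negative log-likelihood loss by hand:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up raw model outputs for 2 samples and 3 classes, plus their true classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, targets)

# The same quantity computed in two explicit steps:
# log-softmax over the class dimension, then mean negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(loss.item(), manual.item())
```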
And now the part we’ve been waiting for – model training!
Model Training
This part will also be extremely simple. We’ll train the model for 100 epochs, keeping track of time and loss. Every 10 epochs we’ll print the current status to the console, showing which epoch we are on and what the current loss is.
Here’s the code:
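%%time
# %%time is a Jupyter cell magic; remove it when running this as a plain script
epochs = 100
loss_arr = []
for i in range(epochs):
    y_hat = model(X_train)
    loss = criterion(y_hat, y_train)
    # Store a plain float so the history can be plotted and does not hold on to the autograd graph
    loss_arr.append(loss.item())
    if i % 10 == 0:
        print(f'Epoch: {i} Loss: {loss}')
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()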
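If you’re wondering what the last 3 lines are doing, the answer is simple: zero_grad() clears the gradients left over from the previous iteration, backward() runs backpropagation to compute the gradients of the loss, and step() uses those gradients to update the weights and biases so the model can actually "learn".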
Here’s the result of the above code:

That was fast – please don’t get used to that feeling.
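To see what a single optimization step does in isolation, here is a minimal sketch. It assumes a tiny stand-in model (one linear layer) and random data, neither of which comes from the article, and it checks that gradients exist after backward() and that the weights differ after step():

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in model and random data, used only to illustrate one step.
layer = nn.Linear(4, 3)
optimizer = torch.optim.Adam(layer.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
X = torch.rand(8, 4)
y = torch.randint(0, 3, (8,))

# Keep a copy of the weights so we can compare them after the update.
before = layer.weight.detach().clone()

optimizer.zero_grad()              # clear gradients from any previous step
loss = criterion(layer(X), y)      # forward pass and loss
loss.backward()                    # backpropagation: fills in .grad for each parameter
grad_ready = layer.weight.grad is not None
optimizer.step()                   # update the parameters using the gradients

changed = not torch.equal(before, layer.weight.detach())
print(grad_ready, changed)
```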
If plain numbers mean absolutely nothing to you, here’s a visualization of our loss (epoch number on the x-axis and loss on the y-axis):
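A plot like this can be produced with Matplotlib in a few lines. The sketch below assumes the losses are available as plain Python floats (for example collected with loss.item()), and it uses a short list of hypothetical values in place of the real training history:

```python
import matplotlib.pyplot as plt

# Hypothetical loss values standing in for the real training history.
loss_values = [1.10, 0.95, 0.70, 0.48, 0.30, 0.19, 0.13, 0.10, 0.08, 0.07]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(range(len(loss_values)), loss_values)
ax.set_xlabel('Epoch')
ax.set_ylabel('Loss')
ax.set_title('Training loss')
```

In a Jupyter notebook the figure is displayed at the end of the cell; in a plain script, call plt.show() or fig.savefig(...) afterwards.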

So, we’ve trained the model, but what now? We need to evaluate it on the previously unseen data somehow. Stay here for a minute more and you’ll find out how.
Model Evaluation
In the evaluation process, we want to keep track of the predictions made by the model. We’ll iterate over X_test, make a prediction for each row, and later compare it to the actual value.
We will use torch.no_grad() here because we’re just evaluating: it turns off gradient tracking, which isn’t needed when we aren’t updating weights and biases.
Anyway, here’s the code:
preds = []
with torch.no_grad():
    for val in X_test:
        y_hat = model(val)
        preds.append(y_hat.argmax().item())
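The loop makes the logic easy to follow, but the whole test set can also be passed through the model in one call. Here is a sketch of that alternative, assuming a small untrained stand-in network and random inputs (both hypothetical), which compares the batched predictions with the row-by-row ones:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Untrained stand-in network and random inputs, used only for illustration.
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
X_demo = torch.rand(30, 4)

with torch.no_grad():
    # One forward pass for all rows; argmax over dim=1 picks a class per row.
    batch_preds = net(X_demo).argmax(dim=1).tolist()
    # Row-by-row version, mirroring the loop above.
    loop_preds = [net(row).argmax().item() for row in X_demo]

print(batch_preds[:5], loop_preds[:5])
```

Both approaches should give the same class for every row; the batched version simply avoids the Python loop.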
The predictions are now stored in the preds list. We can now make a Pandas DataFrame with the following 3 attributes:
- Y: actual value
- YHat: predicted value
- Correct: flag, 1 indicating Y and YHat match, 0 otherwise
Here’s the code:
df = pd.DataFrame({'Y': y_test, 'YHat': preds})
df['Correct'] = [1 if corr == pred else 0 for corr, pred in zip(df['Y'], df['YHat'])]
The first 5 rows of df will look like this:

That’s all great, but how do we actually calculate accuracy?
Well, it’s simple: we only need to sum up the Correct column and divide it by the length of df:
df['Correct'].sum() / len(df)
>>> 1.0
The accuracy of our model on previously unseen data is 100%. Keep in mind that this is only because the Iris dataset is very simple to classify; it is by no means a claim that neural networks are the best algorithm for this dataset. I’d say a neural network is overkill for this type of problem, but that’s a discussion for another time.
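When the accuracy is below 100%, it helps to see which classes get confused with each other. The sketch below uses made-up labels and predictions (not the article’s results) to show a shorter way of computing accuracy and a simple confusion table built with pd.crosstab:

```python
import pandas as pd

# Made-up actual and predicted classes, with one deliberate mistake.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

df_demo = pd.DataFrame({'Y': y_true, 'YHat': y_pred})

# Comparing the two columns gives booleans; their mean is the accuracy.
accuracy = (df_demo['Y'] == df_demo['YHat']).mean()

# Rows are actual classes, columns are predicted classes.
confusion = pd.crosstab(df_demo['Y'], df_demo['YHat'])

print(accuracy)
print(confusion)
```

Here 5 of the 6 predictions are correct, and the table shows that the single error is a class 2 sample predicted as class 1.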
Conclusion
And there you have it: the simplest neural network you’ll ever write, with a perfectly clean dataset, no missing values, and only a handful of layers and neurons. Admit it, it was easy.
The next time it won’t be, as some more advanced concepts will be introduced.
Thanks for reading. Bye.