Welcome to the practical implementation guide of our Deep Learning Illustrated series. In this series, we bridge the gap between theory and application, bringing to life the neural network concepts explored in previous articles.
In today’s article, we’ll build a Convolutional Neural Network (CNN) using TensorFlow. Be sure to read the previous CNN article, as this one assumes you’re already familiar with the inner workings and mathematical foundations of a CNN. We’ll be focusing only on implementation here, so prior knowledge will help you follow along more easily.
Deep Learning Illustrated, Part 3: Convolutional Neural Networks
We’ll create the same simple image classifier that predicts whether a given image is an ‘X’ or not.

And we’ll break down each step in detail along the way to ensure you understand both the how and the why!
Step 1: Importing the Necessary Libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import Adam
import numpy as np
import matplotlib.pyplot as plt
TensorFlow and Keras (a high-level API within TensorFlow) will handle the creation and training of our CNN, while NumPy and Matplotlib will help us with data manipulation and visualization.
NOTE: To ensure that our results are consistent each time we run the code, we’ll set a random seed:
# Setting seed for reproducibility
np.random.seed(42)
tf.random.set_seed(42)
Setting a seed ensures consistent results by making sure the random processes in the code run the same way each time. Think of it like shuffling a deck of cards in exactly the same order every time we play.
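To see this in action, here's a quick optional illustration (kept separate from the tutorial's global seed by using NumPy's Generator API): two generators created with the same seed produce identical draws.
# Two independent generators seeded identically yield the same sequence
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
print(rng_a.random(3))
print(rng_b.random(3))  # identical to the line above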
Step 2: Understanding and Generating the Data
Let’s first generate the images that our model will learn to classify. Previously we saw that an ‘X’ can be represented by a 5×5 pixel image like so:

Let’s translate this to code:
# 'X' pattern
def generate_x_image():
    return np.array([
        [1, 0, 0, 0, 1],
        [0, 1, 0, 1, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [1, 0, 0, 0, 1]
    ])
This function generates a simple 5×5 image of an ‘X’. Next, we’ll create a function that generates random 5×5 images that don’t match the ‘X’ pattern:
def generate_not_x_image():
    # Ensuring not to generate an 'X' pattern
    while True:
        img = np.random.randint(2, size=(5, 5))
        if not np.array_equal(img, generate_x_image()):
            return img
Step 3: Building the Dataset
With our functions ready, we can now create a dataset of 1,000 images. We’ll label them accordingly, with 1 for images of an ‘X’ and 0 for those that are not:
# Create a dataset
num_samples = 1000
images = []
labels = []

for _ in range(num_samples):
    if np.random.rand() > 0.5:
        images.append(generate_x_image())
        labels.append(1)
    else:
        images.append(generate_not_x_image())
        labels.append(0)

images = np.array(images).reshape(-1, 5, 5, 1)
labels = np.array(labels)
This code generates 1,000 images, roughly half of which contain an ‘X’ (each image is chosen by a fair coin flip). We then reshape the array to (1000, 5, 5, 1), adding a channel dimension of 1 (grayscale) because Conv2D expects inputs of shape (height, width, channels).
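If you’d like to verify the reshape yourself, a quick optional check confirms the extra channel dimension:
# Optional check: (num_samples, height, width, channels)
print(images.shape)  # (1000, 5, 5, 1)
print(labels.shape)  # (1000,)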
To train our model effectively, we’ll split this dataset into training and testing sets:
# Split the data into training and testing sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(images, labels, test_size=0.2, random_state=42)
This split reserves 80% of the data for training the model and 20% for testing it. The test set helps us evaluate how well the model performs on new, unseen data.
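Another quick optional check confirms the 80/20 split:
# 80% of 1,000 samples for training, 20% for testing
print(len(x_train), len(x_test))  # 800 200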
Before we dive into model building, let’s take a look at some of the images in our dataset to understand what we’re working with.
# Function to display images
def display_sample_data(images, labels, num_samples=5):
    plt.figure(figsize=(10, 2))
    for i in range(num_samples):
        plt.subplot(1, num_samples, i + 1)
        plt.imshow(images[i].reshape(5, 5), cmap='gray_r')
        plt.title(f'Label: {labels[i]}')
        plt.axis('off')
    plt.show()
This function displays images from our training set, helping us confirm that the data is correctly labeled and formatted.
# Display first 5 samples of our training data
display_sample_data(x_train, y_train)

Step 4: Building the CNN Model
Now that our data is ready, let’s build the CNN! Here’s the architecture we used previously:
1 – Convolutional Layer: Applies four 3×3 filters to the input image to detect features, creating four feature maps

2 – Max-Pooling Layer: Reduces the dimensions of the feature maps, making the model more efficient
3 – Flatten Layer: Converts the 2D data into a 1D array, preparing it for the neural network

4 – Hidden Layer: A fully connected layer with three neurons, each using a ReLU activation function
5 – Output Layer: A single neuron with a sigmoid activation function

model = Sequential([
    # 1 - Convolutional Layer
    Conv2D(4, (3, 3), activation='relu', input_shape=(5, 5, 1)),
    # 2 - Max-Pooling Layer
    MaxPooling2D(pool_size=(2, 2)),
    # 3 - Flatten Layer
    Flatten(),
    # 4 - Hidden Layer
    Dense(3, activation='relu'),
    # 5 - Output Layer
    Dense(1, activation='sigmoid')
])
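Although it’s not part of the original walkthrough, calling model.summary() is a handy way to confirm how the dimensions shrink at each stage: the 3×3 filters turn the 5×5 input into 3×3 feature maps, 2×2 max-pooling reduces those to 1×1, and flattening yields 4 values feeding the dense layers.
# Inspect layer output shapes and parameter counts
model.summary()
# Expected output shapes (per layer):
#   Conv2D       -> (None, 3, 3, 4)
#   MaxPooling2D -> (None, 1, 1, 4)
#   Flatten      -> (None, 4)
#   Dense        -> (None, 3)
#   Dense        -> (None, 1)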
Step 5: Compiling the Model
Compiling the model is crucial as it defines how the model will learn. Here’s what each part of the compile function does:
- Optimizer (Adam): The optimizer adjusts the model’s weights to minimize the loss function. We use Adam here, but TensorFlow offers several other optimizers (such as SGD and RMSprop) that could be used instead.
- Loss Function (Binary Crossentropy): Measures how far off the model’s predictions are from the actual labels. Since we’re dealing with binary classification (‘X’ or not-‘X’), binary cross-entropy is appropriate. The loss is what the model tries to minimize during training: lower values mean the predictions are closer to the actual labels, and the optimizer uses the loss to update the model’s weights in the direction that reduces it.
- Metrics (Accuracy): Metrics evaluate how well the model is performing but, unlike the loss, aren’t used by the optimizer to adjust the model during training. Here we track accuracy, the proportion of correct predictions out of all predictions made.
Note: While accuracy is a common metric, it’s not always the most reliable, especially in scenarios like imbalanced datasets. For simplicity, we’ll use it here, but metrics such as precision, recall, and the F1-score can provide a more nuanced view of model performance.
model.compile(optimizer=Adam(), loss='binary_crossentropy', metrics=['accuracy'])
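To make the loss function less abstract, here’s a minimal optional sketch (using a made-up prediction of 0.9 for a true label of 1) showing that binary cross-entropy computed by hand matches the Keras implementation:
# One example: true label y = 1, predicted probability p = 0.9
y_true, y_pred = 1.0, 0.9

# Binary cross-entropy by hand: -(y*log(p) + (1-y)*log(1-p))
manual_loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(manual_loss)  # ~0.105

# The same value via Keras
bce = tf.keras.losses.BinaryCrossentropy()
print(bce([y_true], [y_pred]).numpy())  # ~0.105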
Step 6: Training the Model
Training the model involves feeding it the training data multiple times (epochs) so it can learn to make better predictions. An epoch is one complete pass through the entire training dataset. We’ll train our model for 10 epochs:
history = model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
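The returned history object records every tracked metric for each epoch; we’ll use it for plotting in Step 8. A quick optional peek shows what it contains:
# history.history maps each tracked metric to a list of per-epoch values
print(history.history.keys())
# dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])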

Step 7: Evaluating the Model
After training, we evaluate the model’s performance on the test data:
loss, accuracy = model.evaluate(x_test, y_test)
print(f'Test Accuracy: {accuracy * 100:.2f}%')

The accuracy metric tells us that 94.8% of the images in the test data were correctly classified.
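As a final optional check (not part of the original walkthrough), we can ask the trained model to classify a freshly generated ‘X’:
# Classify a brand-new 'X' image (reshaped to match the model's input)
new_x = generate_x_image().reshape(1, 5, 5, 1)
probability = model.predict(new_x)[0][0]
print(f"Probability of 'X': {probability:.2f}")  # should be close to 1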
Step 8: Visualizing the Training Process
Finally, let’s visualize how the model’s accuracy changed over the epochs. This helps us understand how well the model learned during training:
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

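Since the history object also stores the loss for each epoch, we can plot the loss curves the same way if we want a second, optional view of training:
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['val_loss'], label='val_loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()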
And that’s it. We’ve built a simple convolutional neural network to predict if an image is an ‘X’ or not using TensorFlow in less than 5 minutes!
As always, feel free to connect with me on LinkedIn for any comments/questions!
Note: Unless specified, all images are by the author.