Understanding Neural Networks Using Excel
A TL;DR Approach to Becoming Familiar with Deep Learning
To simplify the concept of convolutional neural networks, I will walk through what happens inside a deep learning model as it processes an input. For more depth, I recommend searching online, as there is a copious amount of information available (like this video). This explanation is derived from the fast.ai repository.
This picture of a simple neural network represents what is occurring in this example.
Hidden Layer 1
A hidden layer transforms the inputs so that more complex features can be discerned from the data, which lets the output layer make a better assessment.
Two filters represent different shapes: the first filter is designed to detect horizontal edges, and the second detects vertical edges. Each 3x3 filter is called a convolutional kernel. Filter 1 is activated by horizontal edges in the input. Conv1 shows the activations of both filters, each computed by taking a 3x3 section of the input, multiplying it element-wise by the convolutional kernel, and summing the results. The picture below gives you a better idea.
*Although this is represented as 2D arrays, the matrices are meant to be stacked as a tensor, where each matrix represents one slice of the tensor. The operations taking place are all essentially linear algebra: element-wise multiplications followed by sums.
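=SUM(F11:H13*$AD$11:$AF$13) is the convolution taking place: each cell in the 3x3 input range F11:H13 is multiplied by the matching cell of the kernel in $AD$11:$AF$13, and the nine products are added together.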
This sum will result in an activation number of 3 for that particular 3x3 spot in the input.
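To make the formula concrete, here is a minimal Python sketch of the same multiply-and-sum step, written with plain lists so that each value maps to one spreadsheet cell. The kernel values and the 4x4 input are made up for illustration and are not taken from the spreadsheet; the function names conv_at and conv2d are my own. Like the Excel formula, it does not flip the kernel.

```python
def conv_at(image, kernel, row, col):
    """Multiply the patch whose top-left corner is (row, col) by the kernel,
    cell by cell, and add up the products (what the =SUM(range*range) formula does)."""
    k = len(kernel)
    return sum(
        image[row + i][col + j] * kernel[i][j]
        for i in range(k)
        for j in range(k)
    )


def conv2d(image, kernel):
    """Slide the kernel over every position where it fits entirely
    (no padding, step of one cell)."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [conv_at(image, kernel, r, c) for c in range(out_cols)]
        for r in range(out_rows)
    ]


# Assumed horizontal-edge kernel: positive top row, negative bottom row.
horizontal_kernel = [
    [1, 1, 1],
    [0, 0, 0],
    [-1, -1, -1],
]

# Illustrative input: a bright horizontal band on a dark background.
image = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

conv1 = conv2d(image, horizontal_kernel)
print(conv1)  # [[-3, -3], [3, 3]]
```

The bottom row of the result is 3 because the patch there has bright cells on top and dark cells below, which is the pattern this kernel responds to. The top row is negative because the pattern is reversed.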
Next, we add a non-linearity by using ReLU as our activation function, which replaces every negative value with zero. In the next picture, we can see that the negatives have disappeared.
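As a rough sketch, ReLU is just "take the larger of zero and the value" applied to every cell, which in a spreadsheet would look something like wrapping the cell in MAX(0, ...). The input matrix below is made up for illustration.

```python
def relu(matrix):
    """Replace every negative activation with 0 and leave the rest unchanged."""
    return [[max(0, value) for value in row] for row in matrix]


# Illustrative activations containing negatives.
activations = [
    [-3, -3],
    [3, 3],
]

print(relu(activations))  # [[0, 0], [3, 3]]
```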
Hidden Layer 2
Next, we do another convolution. Conv2 will be the next hidden layer. It weighs both matrices in Conv1 by taking their sum product with the kernel. The convolution kernel here is a 2x3x3 tensor: one 3x3 slice for each of the two Conv1 matrices.
After applying ReLU, we have created our second layer.
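The second layer can be sketched the same way. The only change from a single-matrix convolution is that each output cell now adds up the products from both input matrices, each paired with its own 3x3 slice of the kernel. All values below are invented for illustration, and the function names are my own.

```python
def conv2d_multichannel(channels, kernel):
    """Convolve a stack of 2D matrices with a kernel that has one 2D slice per matrix.

    Each output cell is the sum, over all matrices, of the cell-by-cell
    product between that matrix's patch and its kernel slice.
    """
    k = len(kernel[0])  # kernel height and width (3 here)
    out_rows = len(channels[0]) - k + 1
    out_cols = len(channels[0][0]) - k + 1
    output = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            total = 0
            for channel, kernel_slice in zip(channels, kernel):
                for i in range(k):
                    for j in range(k):
                        total += channel[r + i][c + j] * kernel_slice[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


def relu(matrix):
    """Replace every negative activation with 0."""
    return [[max(0, value) for value in row] for row in matrix]


# Two illustrative 4x4 matrices standing in for the two Conv1 outputs.
conv1_stack = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    [[2, 2, 2, 2], [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1]],
]

# Assumed 2x3x3 kernel: one 3x3 slice per input matrix.
kernel_2x3x3 = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[0, 0, 0], [0, 0, 0], [-1, -1, -1]],
]

pre_activation = conv2d_multichannel(conv1_stack, kernel_2x3x3)
conv2 = relu(pre_activation)
print(pre_activation)  # [[3, 0], [-3, 0]]
print(conv2)           # [[3, 0], [0, 0]]
```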
Max pooling halves the resolution in both height and width by keeping only the maximum of each 2x2 section in Conv2. In the Maxpool matrix, we can see just the maximum value of the 2x2 section of Conv2, which is 33. Pooling is faster to compute than a convolution. It also gives you some amount of translation invariance.
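Here is a small sketch of 2x2 max pooling. The 4x4 matrix is made up, apart from reusing 33 as the largest value in one 2x2 section to echo the example above. The second call moves the 33 to a neighbouring cell within the same section to show what "some translation invariance" means in practice: the pooled output does not change.

```python
def max_pool_2x2(matrix):
    """Keep the maximum of each non-overlapping 2x2 section,
    halving the height and the width."""
    return [
        [
            max(
                matrix[r][c], matrix[r][c + 1],
                matrix[r + 1][c], matrix[r + 1][c + 1],
            )
            for c in range(0, len(matrix[0]) - 1, 2)
        ]
        for r in range(0, len(matrix) - 1, 2)
    ]


# Illustrative Conv2 activations.
conv2 = [
    [12, 33, 5, 0],
    [7, 20, 8, 14],
    [0, 3, 9, 1],
    [4, 6, 2, 11],
]

# Same activations with the 33 shifted one cell to the left.
conv2_shifted = [
    [33, 12, 5, 0],
    [7, 20, 8, 14],
    [0, 3, 9, 1],
    [4, 6, 2, 11],
]

print(max_pool_2x2(conv2))          # [[33, 14], [6, 11]]
print(max_pool_2x2(conv2_shifted))  # [[33, 14], [6, 11]]
```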
Next, we build our fully connected layer by taking all of the activations in Maxpool and giving each one a weight. This is done with a matrix product. In Excel, we take the SUMPRODUCT of the activations and the weights. So instead of scanning one section at a time as a convolutional layer does, a fully connected layer (dense layer) performs the classification on the features that were extracted by the convolutional layers and downsampled by the max-pooling layers.
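The SUMPRODUCT step can be sketched as follows: every activation is multiplied by its own weight and everything is added into a single number. The activations and weights below are hypothetical, and dense_score is a name I chose for illustration.

```python
def dense_score(activations, weights):
    """Multiply every activation by its own weight and sum the results,
    like SUMPRODUCT over two ranges of the same shape."""
    flat_activations = [value for row in activations for value in row]
    flat_weights = [weight for row in weights for weight in row]
    return sum(a * w for a, w in zip(flat_activations, flat_weights))


# Illustrative pooled activations and made-up weights of the same shape.
maxpool = [
    [33, 14],
    [6, 11],
]
weights = [
    [0.5, -1.0],
    [2.0, 0.0],
]

score = dense_score(maxpool, weights)
print(score)  # 14.5
```

Unlike a convolution, nothing slides here: there is exactly one weight per activation, so the whole pooled matrix contributes to one output value.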
This example only represents one class, meaning one number. We still have to classify the rest of the numbers.
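One way to picture the remaining classes, as a sketch only: give each class its own set of weights, compute one score per class from the same pooled activations, and pick the class with the highest score. The three weight sets below are invented, and a real digit classifier would have ten of them.

```python
def dense_score(activations, weights):
    """Sum of each activation multiplied by its matching weight."""
    return sum(a * w for a, w in zip(activations, weights))


# Illustrative pooled activations, flattened into a single list.
maxpool_flat = [33, 14, 6, 11]

# Hypothetical weights: one list per class (only three classes shown).
class_weights = [
    [0.5, -1.0, 2.0, 0.0],   # class 0
    [1.0, 0.0, 0.0, 1.0],    # class 1
    [-1.0, 1.0, 1.0, 1.0],   # class 2
]

scores = [dense_score(maxpool_flat, w) for w in class_weights]
predicted_class = scores.index(max(scores))

print(scores)           # [14.5, 44.0, -2.0]
print(predicted_class)  # 1
```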