
If you thought Machine Learning was the crush you never had the guts to talk to, Deep Learning is that crush's dad! Thanks to unprecedented advances in hardware and researchers' appetite for bigger and better models, deep learning is becoming more intimidating and elusive by the day. The more research bubbles up every day, the higher it pushes the bar of basic knowledge you're expected to have. So, for all those folks who are hesitant to dive straight into the murky, tacky goo-iness of deep learning, I hope this article boosts your confidence. It will not discuss any of the mathematics behind these models, but it will give you the conceptual grounding to make your journey easier when it's time to face their gory mathematics.
Fully connected networks (FCNs)
The simplest form of deep network you'll ever find. It usually has an input layer, an output layer and, optionally, multiple hidden layers in between. Let the following analogy do the explaining.
Analogy: Frankenstein’s Lab
Imagine you're an assistant at Frankenstein's laboratory, which is full of quirky equipment. Frankenstein has just asked you to use the following apparatus, along with blue, red, black and yellow paint, to make purple, dark orange and green. When you pour the colors in at the top, the paint flows down to the series of balls below, depending on how wide the openings of the tubes are. There's a mechanism to change the width of each tube.

This is analogous to how an FCN works. You give it inputs, i.e. a feature vector (e.g. various attributes of a flower; the paint buckets in the example), and it predicts an outcome (e.g. the flower species; the mixed colors in the example). Predicting the outcome is a series of mathematical computations (matrix multiplications, additions, etc.). You might have noticed that this equipment had already been set to its optimal configuration. Reaching that optimal setting is known as training/optimizing the model, and it uses the input feature vector, the predicted label and the true label (tweaking the tube widths in the example). The network also doesn't need to be just an input layer and an output layer; you can have intermediate (i.e. hidden) layers as well. Here's what an FCN really looks like. (Learn more: Here)
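If you'd rather see this in code, here's a minimal sketch of an FCN in TensorFlow/Keras, assuming the flower example with four input features and three species; the layer sizes and training settings are purely illustrative, not the "one true" setup.

```python
import tensorflow as tf

# A minimal FCN: 4 input features (e.g. flower attributes) mapped to 3 classes
# (e.g. flower species). The layer sizes are illustrative, not prescriptive.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),             # input layer: the feature vector
    tf.keras.layers.Dense(16, activation='relu'),  # hidden layer
    tf.keras.layers.Dense(3, activation='softmax') # output layer: class probabilities
])

# Training ("tweaking the tube widths") adjusts the weights using the inputs,
# the predicted labels and the true labels.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(X_train, y_train, epochs=10)  # X_train: (N, 4) features, y_train: (N,) labels
```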

Applications
- Simple classification/regression tasks on structured data, e.g. predicting house prices given house attributes
Autoencoders
Autoencoders are a type of fully connected network that differs only in the way it is used. They start with an input just like FCNs, map it to a smaller hidden representation (called encoding) and finally reconstruct the original input from that representation (called decoding).
Analogy: Back in the Frankenstein’s Lab
Say you and your buddy have two of the above gadgets and decide to play with them. You combine them so that they share the output; the setup now looks like the one below. Note that the tube configuration on the bottom is a mirror image of the one on top.

You pour colors in from the top and let them mix in the middle. Then a "magical particle separator" that can split mixed paint lets the colors return to their original forms at the bottom. You could even pour a random color on top and let this contraption figure out that random color at the bottom.
This is what happens in an Autoencoder. It takes an input (the paint in the example), computes a smaller latent representation (the mixed colors in the example), and finally reconstructs the original input (the colors at the bottom in the example). Here's what this looks like in a real Autoencoder. (Learn more: Autoencoder)
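As a rough code sketch, here's a minimal Autoencoder in TensorFlow/Keras, assuming 28×28 images flattened into 784-value vectors and a 32-dimensional encoding; those sizes are just assumptions for illustration. The key point is that the model is trained to reproduce its own input, so the input doubles as the target.

```python
import tensorflow as tf

# A minimal autoencoder for 28x28 images flattened to 784 values; the
# 32-dimensional bottleneck plays the role of the mixed colors in the middle.
inputs = tf.keras.Input(shape=(784,))
encoded = tf.keras.layers.Dense(32, activation='relu')(inputs)       # encoder
decoded = tf.keras.layers.Dense(784, activation='sigmoid')(encoded)  # decoder (mirror image)

autoencoder = tf.keras.Model(inputs, decoded)
# The model is trained to reproduce its own input, so the input is also the target.
autoencoder.compile(optimizer='adam', loss='mse')
# autoencoder.fit(X, X, epochs=10)  # same data passed as both input and target
```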

Applications
- Restoring corrupted images – You can train an Autoencoder to restore images by inputting a corrupted image and asking the model to predict the original image (similar to identifying the random color poured on top).
- Image/data clustering – You can use the smaller latent representation learnt by the encoder as a proxy feature representation of the data, allowing you to cluster it
Convolution neural networks (CNNs)
Ah, the conquerors of computer vision! CNNs are extremely good at processing images. They are made of convolution layers, fully connected layers and, optionally, pooling layers. A CNN takes in an image that has a height, a width and channels (e.g. RGB – red, green, blue).
Analogy: Robbery at the museum!
There's an obnoxious criminal on the loose who's trying to break into the museum to steal a diamond. He goes into the museum and plans the heist. He breaks the floor into a 5×5 grid. Then he walks from one cell to the next, left to right along each row. At each cell he looks at four neighboring cells (the one he's standing on, the cell to the right, the cell above and the cell to the top right). If he sees some obstacle/artifact in that field of view, he shoots a green glow-in-the-dark mark onto the ceiling directly above, and if the diamond is in one of those four cells, he shoots a red glow-in-the-dark mark. If he does this standing at every cell on the floor, he ends up with the plan below. With this he can sneak in at night and know exactly where to go, even in pitch dark!

This is what the convolution layer of a CNN does. It moves a kernel (what the robber wants to map), which sees a small patch of the image at a time, over the whole image (like the robber visiting every cell), and at each position it outputs some value (e.g. whether an obstacle is present or not). This process builds feature maps that provide useful macro-level information about what has been analysed. Finally, you connect an FCN to the last convolution layer, since you need an FCN for the final classification/regression task. Here's what a typical CNN looks like. (Learn more: Convolution Neural Networks).
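To make this concrete, here's a minimal CNN sketch in TensorFlow/Keras, assuming 28×28 grayscale images and 10 object classes; the filter counts, kernel sizes and class count are illustrative assumptions rather than a recommended architecture.

```python
import tensorflow as tf

# A minimal CNN: 28x28 grayscale images classified into 10 object classes.
# Image size, filter counts and class count are illustrative assumptions.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),               # height, width, channels
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu'),  # kernel sliding over the image
    tf.keras.layers.MaxPooling2D((2, 2)),                   # optional pooling layer
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),                               # feature maps -> vector
    tf.keras.layers.Dense(10, activation='softmax')          # FCN head for classification
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```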

Applications
- Image classification – Identify the category of an object present in the image
- Object detection – Identify all objects in an image, as well as their positions
Conclusion
We looked at three algorithms: fully connected networks (FCNs), Autoencoders and convolution neural networks (CNNs). Here are the main takeaways.
- An FCN takes in an input feature vector and predicts the correct output class
- An Autoencoder takes in an input, transforms it into a smaller representation and then reconstructs the original input
- A CNN takes in an image, sends it through a series of convolution/pooling layers and finally through an FCN, which predicts the correct object class present in the image
Want to get better at deep networks and TensorFlow?
Check out my work on the subject.

[1] (Book) TensorFlow 2 in Action – Manning
[2] (Video Course) Machine Translation in Python – DataCamp
[3] (Book) Natural Language Processing in TensorFlow 1 — Packt
New! Join me on my new YouTube channel

If you are keen to see my videos on various machine learning/deep learning topics, make sure to join DeepLearningHero.