
How Number of Hidden Layers Affects the Quality of Autoencoder Latent Representation

Hyperparameter tuning in autoencoders – Part 1

Photo by Clark Van Der Beken on Unsplash

Introduction

You may already know that an autoencoder's latent representation captures the most important features of the input data when its dimension is significantly lower than that of the input data.

The quality of the autoencoder latent representation depends on many factors, such as the number of hidden layers, the number of nodes in each layer, the dimension of the latent vector, the type of activation function in the hidden layers, the type of optimizer, the learning rate, the number of epochs, the batch size, etc. Technically, these factors are called autoencoder model hyperparameters.

Obtaining the best values for these hyperparameters is called hyperparameter tuning. There are different hyperparameter tuning techniques available in machine learning. One simple technique is to manually tune one hyperparameter (here, the number of hidden layers) while keeping the other hyperparameter values unchanged.

Today, in this special episode, I will show you how the number of hidden layers affects the quality of the autoencoder latent representation.

The dataset we use

We will use the MNIST dataset (see Citation at the end) to build the autoencoder models here.

Approach

We will build three autoencoder models whose architectures differ only in the number of hidden layers, while the other hyperparameter values remain unchanged. I will use several visualization techniques to verify the results.

Autoencoder with one hidden layer

(Image by author)

The above autoencoder architecture can be coded as follows.

(Image by author)

Now, we can train the model.

(Image by author)
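Since the code above appears only as screenshots, here is a minimal sketch of what a comparable model and training step might look like in Keras. The hidden-layer width (100 units), the 2-D latent dimension (assumed because the latent space is later plotted in two dimensions), and the synthetic stand-in data are my assumptions, not taken from the article; I also read "one hidden layer" as one hidden layer in the encoder, mirrored in the decoder.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

latent_dim = 2  # assumed: the article visualizes the latent space in 2-D

# Encoder: 784-pixel flattened image -> one hidden layer -> latent vector
inputs = Input(shape=(784,))
h = Dense(100, activation="relu")(inputs)
latent = Dense(latent_dim)(h)

# Decoder: latent vector -> one hidden layer -> reconstructed image
h_dec = Dense(100, activation="relu")(latent)
outputs = Dense(784, activation="sigmoid")(h_dec)

autoencoder = Model(inputs, outputs)
encoder = Model(inputs, latent)  # used later to visualize the latent space

autoencoder.compile(optimizer="adam", loss="mse")

# Synthetic stand-in for the flattened, scaled MNIST images,
# i.e. x_train.reshape(-1, 784) / 255.0 in the real workflow
x_train = np.random.rand(256, 784).astype("float32")
autoencoder.fit(x_train, x_train, epochs=1, batch_size=32, verbose=0)

codes = encoder.predict(x_train, verbose=0)
print(codes.shape)  # (256, 2)
```

In the article's workflow the same `fit` call would simply run for more epochs on the full MNIST training set.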

Now, we can visualize the reconstructed MNIST digits, decoded from the compressed representation.

The output of autoencoder with one hidden layer (Image by author)

This output is not clear enough to distinguish between each digit in the MNIST dataset because a single hidden layer is not enough to capture most of the complex non-linear patterns (relationships) in the MNIST data.
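A side-by-side plot like the one above might be produced roughly as follows, assuming matplotlib. The `originals` and `reconstructions` arrays here are random stand-ins for `x_test` and `autoencoder.predict(x_test)`, just to keep the sketch self-contained.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

originals = np.random.rand(10, 784)        # stand-in for x_test
reconstructions = np.random.rand(10, 784)  # stand-in for autoencoder.predict(x_test)

# Top row: original digits; bottom row: reconstructions
fig, axes = plt.subplots(2, 10, figsize=(10, 2))
for i in range(10):
    axes[0, i].imshow(originals[i].reshape(28, 28), cmap="gray")
    axes[1, i].imshow(reconstructions[i].reshape(28, 28), cmap="gray")
    axes[0, i].axis("off")
    axes[1, i].axis("off")
fig.savefig("reconstructions.png")
```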

We can compare the above output with the original MNIST digits.

Original MNIST digits (Image by author)

Now, we can visualize the test MNIST data in the latent space to see how well the autoencoder model with one hidden layer can distinguish between the ten digits.

Visualize the test MNIST data in the latent space (Image by author)

The digits do not form clearly separated clusters in the latent space. This means that the autoencoder model with only one hidden layer cannot clearly distinguish between the ten digits in the test MNIST data.
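A latent-space scatter plot like the one above can be sketched as follows, again assuming matplotlib. Here `latent_codes` stands in for `encoder.predict(x_test)` and `labels` for `y_test`; both are random placeholders so the sketch runs on its own.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
latent_codes = rng.normal(size=(500, 2))  # stand-in for encoder.predict(x_test)
labels = rng.integers(0, 10, size=500)    # stand-in for y_test (digits 0-9)

# One color per digit class; well-separated clusters indicate a
# latent representation that distinguishes the digits
fig, ax = plt.subplots(figsize=(6, 5))
points = ax.scatter(latent_codes[:, 0], latent_codes[:, 1],
                    c=labels, cmap="tab10", s=5)
fig.colorbar(points, ax=ax, label="digit")
ax.set_xlabel("latent dimension 1")
ax.set_ylabel("latent dimension 2")
fig.savefig("latent_space.png")
```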

To address this, we can increase the number of hidden layers in the autoencoder model.

Autoencoder with two hidden layers

(Image by author)

The above autoencoder architecture can be coded as follows.

(Image by author)
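The two-hidden-layer version differs from the previous model only in the depth of the encoder and decoder. A sketch of what it might look like in Keras follows; the layer widths (784 → 300 → 100 → 2, mirrored in the decoder) are assumptions, since the original code is a screenshot.

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(784,))
h = Dense(300, activation="relu")(inputs)  # hidden layer 1
h = Dense(100, activation="relu")(h)       # hidden layer 2
latent = Dense(2)(h)                       # 2-D latent vector

h = Dense(100, activation="relu")(latent)  # decoder mirrors the encoder
h = Dense(300, activation="relu")(h)
outputs = Dense(784, activation="sigmoid")(h)

autoencoder = Model(inputs, outputs)
print(autoencoder.output_shape)  # (None, 784)
```

Compiling and fitting work exactly as for the one-hidden-layer model.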

Now, we can train the model as previously.

(Image by author)

Now, we can visualize the reconstructed MNIST digits, decoded from the compressed representation.

The output of autoencoder with two hidden layers (Image by author)

This output is much better than the previous one because two hidden layers can capture a significant amount of the complex non-linear patterns (relationships) in the MNIST data. However, the result is still far from the original MNIST digits.

Original MNIST digits (Image by author)

Now, we can visualize the test MNIST data in the latent space to see how well the autoencoder model with two hidden layers can distinguish between the ten digits.

Visualize the test MNIST data in the latent space (Image by author)

Compared to the previous output, some digits form clearly separated clusters in the latent space and some do not. This means that the autoencoder model with two hidden layers can distinguish between the ten digits in the test MNIST data to some extent, but not perfectly!

To address this, we can further increase the number of hidden layers in the autoencoder model.

Autoencoder with three hidden layers

(Image by author)

The above autoencoder architecture can be coded as follows.

(Image by author)
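Extending the same pattern once more gives the three-hidden-layer model. As before, this is a sketch with assumed layer widths (784 → 500 → 300 → 100 → 2, mirrored in the decoder), since the original code appears only as a screenshot.

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(784,))
h = Dense(500, activation="relu")(inputs)  # hidden layer 1
h = Dense(300, activation="relu")(h)       # hidden layer 2
h = Dense(100, activation="relu")(h)       # hidden layer 3
latent = Dense(2)(h)                       # 2-D latent vector

h = Dense(100, activation="relu")(latent)  # decoder mirrors the encoder
h = Dense(300, activation="relu")(h)
h = Dense(500, activation="relu")(h)
outputs = Dense(784, activation="sigmoid")(h)

autoencoder = Model(inputs, outputs)
print(autoencoder.output_shape)  # (None, 784)
```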

Now, we can train the model as previously.

(Image by author)

Now, we can visualize the reconstructed MNIST digits, decoded from the compressed representation.

The output of autoencoder with three hidden layers (Image by author)

This output is much better than the previous two outputs because three hidden layers can capture most of the complex non-linear patterns (relationships) in the MNIST data. In addition, this output is close to the original MNIST digits, though still not perfect!

Original MNIST digits (Image by author)

Now, we can visualize the test MNIST data in the latent space to see how well the autoencoder model with three hidden layers can distinguish between the ten digits.

Visualize the test MNIST data in the latent space (Image by author)

Compared to the previous two outputs, most of the digits form clearly separated clusters in the latent space. This means that the autoencoder model with three hidden layers can clearly distinguish between the ten digits in the test MNIST data.

Conclusion

You may try to further increase the number of hidden layers in the autoencoder model. When you do so, as the model becomes more complex with many hidden layers, it may overfit, which significantly reduces the quality of the autoencoder latent representation. To mitigate overfitting in autoencoder models, you can try different regularization techniques such as dropout or early stopping.
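The two mitigations just mentioned might be wired in as follows, assuming Keras: a `Dropout` layer inside the encoder, and an `EarlyStopping` callback that halts training when the validation loss stops improving. The layer widths, dropout rate, and patience value are assumptions for illustration, and the data is a synthetic stand-in for MNIST.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping

inputs = Input(shape=(784,))
h = Dense(100, activation="relu")(inputs)
h = Dropout(0.2)(h)  # randomly silence 20% of units during training
latent = Dense(2)(h)
h = Dense(100, activation="relu")(latent)
outputs = Dense(784, activation="sigmoid")(h)

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")

# Stop when val_loss has not improved for 3 consecutive epochs,
# and roll back to the best weights seen so far
stop_early = EarlyStopping(monitor="val_loss", patience=3,
                           restore_best_weights=True)

# Synthetic stand-in data; the article uses the flattened, scaled MNIST images
x = np.random.rand(256, 784).astype("float32")
autoencoder.fit(x, x, validation_split=0.2, epochs=2,
                batch_size=32, callbacks=[stop_early], verbose=0)
```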

To conclude this article, I’d like to show you the outputs of the three autoencoder models in one picture as compared to the original MNIST data.

Autoencoder outputs in one picture (Image by author)

The top row represents the original MNIST data. As we increase the number of hidden layers in the autoencoder model one by one, the quality of the autoencoder latent representation improves! We could consider increasing the number of hidden layers even further, but that may result in overfitting, as I mentioned earlier.


This is the end of today’s post.

Please let me know if you have any questions or feedback.

Read next (Recommended)

  • An Introduction to Autoencoders in Deep Learning

  • How the Dimension of Autoencoder Latent Vector Affects the Quality of Latent Representation


  • How Autoencoders Outperform PCA in Dimensionality Reduction



Support me as a writer

I hope you enjoyed reading this article. If you’d like to support me as a writer, kindly consider signing up for a membership to get unlimited access to Medium. It only costs $5 per month and I will receive a portion of your membership fee.

Join Medium with my referral link – Rukshan Pramoditha

Thank you so much for your continuous support! See you in the next article. Happy learning to everyone!


MNIST dataset info

  • Citation: Deng, L., 2012. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), pp. 141–142.
  • Source: http://yann.lecun.com/exdb/mnist/
  • License: Yann LeCun (Courant Institute, NYU) and Corinna Cortes (Google Labs, New York) hold the copyright of the MNIST dataset which is available under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA). You can learn more about different dataset license types here.

Rukshan Pramoditha 2022–08–23

