Deep learning

Celebrity Face Generation With Deep Convolutional GANs

Generating new images of faces with DCGANs

NVS Yashwanth
Towards Data Science
5 min read · Sep 24, 2020


In this article, I’ll walk you through a fun project where you will implement a DCGAN for face generation.

We will make use of the Large-scale CelebFaces Attributes (CelebA) dataset to train our adversarial networks.

If you are not familiar with GANs and how they work, please read this article featured on Towards Data Science.

Large-scale CelebFaces Attributes (CelebA) dataset

CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations.

Data can be downloaded from here.

We will be making use of Deep Convolutional GANs. If you want to read about DCGANs, check out this article.

Pre-processing and data loading

We don't need the attribute annotations, only the images themselves, which we will crop and resize during pre-processing. These are color images, so the depth is 3 (RGB, three color channels).

We resize and crop the images to 32x32, then convert them to tensors.

Note that we use torchvision's ImageFolder wrapper here.

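A minimal sketch of this step, assuming PyTorch and torchvision (the folder path is a placeholder for wherever the dataset was unzipped):

```python
import torch
from torchvision import datasets, transforms

def get_dataloader(batch_size, image_size, data_dir):
    """Return a DataLoader of resized, center-cropped CelebA images as tensors."""
    transform = transforms.Compose([
        transforms.Resize(image_size),      # shrink the shorter side to image_size
        transforms.CenterCrop(image_size),  # crop to image_size x image_size
        transforms.ToTensor(),              # convert to a float tensor in [0, 1]
    ])
    # ImageFolder expects data_dir to contain at least one sub-folder of images
    dataset = datasets.ImageFolder(data_dir, transform=transform)
    return torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)

# 'celeba/' is an illustrative path, not the author's exact directory
train_loader = get_dataloader(batch_size=128, image_size=32, data_dir='celeba/')
```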

Visualizing our training data

We will now fetch a batch of data and visualize it. Note that the np.transpose function permutes the image dimensions into the order specified. For example, an RGB image of shape 3x32x32 gets transposed to 32x32x3 upon calling the following function:

np.transpose(img, (1, 2, 0))

Our batch size is 128, so instead of plotting all 128 images of a batch, we only plot 20 in this case.

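A sketch of the plotting step, assuming matplotlib and the train_loader defined above:

```python
import numpy as np
import matplotlib.pyplot as plt

# Grab one batch of 128 images from the loader
images, _ = next(iter(train_loader))

# Plot only the first 20 images of the batch
fig = plt.figure(figsize=(20, 4))
for idx in range(20):
    ax = fig.add_subplot(2, 10, idx + 1, xticks=[], yticks=[])
    # Permute from (C, H, W) to (H, W, C) so matplotlib can display it
    ax.imshow(np.transpose(images[idx].numpy(), (1, 2, 0)))
plt.show()
```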
A batch of training images. Image by the author.

Scaling images

It is important to scale the images: the generator's Tanh output produces values in the range -1 to 1, so we need to rescale our training images to that same range. (Right now, they are in the range 0 to 1.)

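One possible implementation of such a rescaling function:

```python
def scale(x, feature_range=(-1, 1)):
    """Rescale a tensor from [0, 1] to the given feature range, here [-1, 1]."""
    lo, hi = feature_range
    return x * (hi - lo) + lo
```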

Helper functions — Convolutional & Transpose Convolutional

To make things easier and to simplify our code, we will define helper functions for building our Discriminator & Generator networks.

The reason for these helper functions? DRY (don't repeat yourself)! 😅

Convolution helper function

Note: To read about CNNs, check out the Stanford CS231n notes.

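A sketch of the convolution helper, following the common DCGAN pattern of a Conv2d optionally followed by batch normalization (the default kernel size, stride, and padding are assumptions):

```python
import torch.nn as nn

def conv(in_channels, out_channels, kernel_size=4, stride=2, padding=1, batch_norm=True):
    """Convolutional layer, optionally followed by batch normalization."""
    layers = [nn.Conv2d(in_channels, out_channels, kernel_size,
                        stride=stride, padding=padding, bias=False)]
    if batch_norm:
        layers.append(nn.BatchNorm2d(out_channels))
    return nn.Sequential(*layers)
```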

Transpose Convolution helper function

Note: If you want to read about transpose convolution, check out the DCGAN article linked earlier.

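The transpose convolution helper mirrors the one above, with ConvTranspose2d in place of Conv2d:

```python
def deconv(in_channels, out_channels, kernel_size=4, stride=2, padding=1, batch_norm=True):
    """Transpose convolutional layer, optionally followed by batch normalization."""
    layers = [nn.ConvTranspose2d(in_channels, out_channels, kernel_size,
                                 stride=stride, padding=padding, bias=False)]
    if batch_norm:
        layers.append(nn.BatchNorm2d(out_channels))
    return nn.Sequential(*layers)
```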

Discriminator Architecture

We shall now define our discriminator network. The discriminator, as we know, is responsible for classifying images as real or fake, so this is a typical convolutional classifier network.

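One possible 32x32 discriminator built from the conv helper above; the channel widths and three-layer depth are assumptions, not necessarily the author's exact architecture:

```python
import torch.nn.functional as F

class Discriminator(nn.Module):
    def __init__(self, conv_dim=32):
        super().__init__()
        self.conv_dim = conv_dim
        self.conv1 = conv(3, conv_dim, batch_norm=False)  # 32x32 -> 16x16, no batch norm on the first layer
        self.conv2 = conv(conv_dim, conv_dim * 2)         # 16x16 -> 8x8
        self.conv3 = conv(conv_dim * 2, conv_dim * 4)     # 8x8 -> 4x4
        self.fc = nn.Linear(conv_dim * 4 * 4 * 4, 1)      # single real/fake logit

    def forward(self, x):
        x = F.leaky_relu(self.conv1(x), 0.2)
        x = F.leaky_relu(self.conv2(x), 0.2)
        x = F.leaky_relu(self.conv3(x), 0.2)
        x = x.view(-1, self.conv_dim * 4 * 4 * 4)
        return self.fc(x)  # raw logit; the sigmoid lives inside BCEWithLogitsLoss
```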

Generator Architecture

The generator network is responsible for producing fake images that can fool the discriminator into classifying them as real. Over time, the generator becomes quite good at fooling the discriminator.

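A matching generator sketch that reverses the discriminator's shape, using the deconv helper (again, the layer sizes are assumptions):

```python
class Generator(nn.Module):
    def __init__(self, z_size=100, conv_dim=32):
        super().__init__()
        self.conv_dim = conv_dim
        self.fc = nn.Linear(z_size, conv_dim * 4 * 4 * 4)     # project z onto a 4x4 feature map
        self.deconv1 = deconv(conv_dim * 4, conv_dim * 2)     # 4x4 -> 8x8
        self.deconv2 = deconv(conv_dim * 2, conv_dim)         # 8x8 -> 16x16
        self.deconv3 = deconv(conv_dim, 3, batch_norm=False)  # 16x16 -> 32x32 RGB

    def forward(self, z):
        x = self.fc(z).view(-1, self.conv_dim * 4, 4, 4)
        x = F.relu(self.deconv1(x))
        x = F.relu(self.deconv2(x))
        return torch.tanh(self.deconv3(x))  # values in [-1, 1], matching the scaled inputs
```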

Parameter initialization

We will initialize the weights and biases of our networks by sampling random values from a normal distribution, which tends to give better results. We define a function for this that takes a layer as input.

For weights, I used 0 mean and 0.02 standard deviation.

For bias, I used 0.

This should be an in-place replacement, which is what the trailing _ (underscore) on the function name signifies in PyTorch.

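A sketch of the initializer; weights_init_normal is an illustrative name, and the conv_dim and z_size values match the sketches above:

```python
def weights_init_normal(m):
    """Initialize Conv/Linear weights from N(0, 0.02) and biases to 0, in place."""
    classname = m.__class__.__name__
    if 'Conv' in classname or 'Linear' in classname:
        m.weight.data.normal_(0.0, 0.02)  # trailing underscore: in-place operation
        if m.bias is not None:
            m.bias.data.fill_(0)

D = Discriminator(conv_dim=32)
G = Generator(z_size=100, conv_dim=32)
# .apply walks every sub-module and calls the function on each of them
D.apply(weights_init_normal)
G.apply(weights_init_normal)
```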

Loss functions and optimizer

We shall make use of the Adam optimizer with a learning rate of 0.0002, as per the original research paper on DCGANs (Radford et al., 2015).

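Setting that up in PyTorch (beta1 = 0.5 also comes from the DCGAN paper):

```python
import torch.optim as optim

lr = 0.0002
betas = (0.5, 0.999)  # beta1 = 0.5, as recommended by the DCGAN paper

d_optimizer = optim.Adam(D.parameters(), lr, betas=betas)
g_optimizer = optim.Adam(G.parameters(), lr, betas=betas)
criterion = nn.BCEWithLogitsLoss()
```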

We use BCEWithLogitsLoss(), which combines a sigmoid activation (we want the discriminator to output a value between 0 and 1 indicating whether an image is real or fake) with binary cross-entropy loss.

Binary cross-entropy loss, where y is the true label and ŷ the predicted probability: BCE(y, ŷ) = -[ y·log(ŷ) + (1 - y)·log(1 - ŷ) ]

Training phase

We will define a function for training our model. Its parameters are the discriminator, the generator, and the number of epochs.

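A sketch of the training loop, reusing the optimizers, criterion, scale function, and loader defined above; real images get labels of 1, fakes get 0, and the generator is trained with flipped labels:

```python
def real_loss(d_out):
    # Labels of 1 for real images; BCEWithLogitsLoss applies the sigmoid
    labels = torch.ones(d_out.size(0))
    return criterion(d_out.squeeze(), labels)

def fake_loss(d_out):
    # Labels of 0 for fake images
    labels = torch.zeros(d_out.size(0))
    return criterion(d_out.squeeze(), labels)

def train(D, G, n_epochs, z_size=100, print_every=300):
    losses = []
    for epoch in range(n_epochs):
        for batch_i, (real_images, _) in enumerate(train_loader):
            real_images = scale(real_images)  # rescale to [-1, 1]
            batch_size = real_images.size(0)

            # 1. Train the discriminator on real and fake images
            d_optimizer.zero_grad()
            d_loss = real_loss(D(real_images))
            z = torch.randn(batch_size, z_size)
            d_loss = d_loss + fake_loss(D(G(z).detach()))  # detach: don't update G here
            d_loss.backward()
            d_optimizer.step()

            # 2. Train the generator: try to make D label fakes as real
            g_optimizer.zero_grad()
            z = torch.randn(batch_size, z_size)
            g_loss = real_loss(D(G(z)))  # flipped labels
            g_loss.backward()
            g_optimizer.step()

            if batch_i % print_every == 0:
                losses.append((d_loss.item(), g_loss.item()))
                print(f'Epoch {epoch + 1}/{n_epochs} | '
                      f'd_loss: {d_loss.item():.4f} | g_loss: {g_loss.item():.4f}')
    return losses
```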

Plotting losses

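Assuming train returns a list of (d_loss, g_loss) tuples, as in the sketch above:

```python
losses = np.array(train(D, G, n_epochs=10))

fig, ax = plt.subplots()
ax.plot(losses.T[0], label='Discriminator')
ax.plot(losses.T[1], label='Generator')
ax.set_xlabel('Checkpoint')
ax.set_ylabel('Loss')
ax.legend()
plt.show()
```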
Discriminator and generator losses over training. Image by the author.

Sample generation

Let us generate a few samples now. It is important that we rescale the values back to the pixel range (0–255).

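A sketch of the sampling step; the batch of 16 latent vectors is an arbitrary choice:

```python
G.eval()  # switch batch norm layers to evaluation mode
with torch.no_grad():
    z = torch.randn(16, 100)  # 16 latent vectors of size z_size
    samples = G(z)

# Rescale from [-1, 1] back to the 0-255 pixel range
samples = ((samples + 1) * 255 / 2).to(torch.uint8)

fig = plt.figure(figsize=(16, 4))
for idx in range(16):
    ax = fig.add_subplot(2, 8, idx + 1, xticks=[], yticks=[])
    ax.imshow(np.transpose(samples[idx].numpy(), (1, 2, 0)))
plt.show()
```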

And finally, we have our generated faces below. 👀

Images generated. Image by the author.

Well, that is pretty good for a small network, aye!

Takeaways

  1. A human face contains multiple features, some of them very intricate, such as freckles and facial hair. To generate such features properly, we might need higher-resolution training images.
  2. Given high-resolution images for training, we might also need to build a better, deeper model for better results.
  3. Based on the input image size, we can further increase the depth and number of layers of the model.
  4. Increasing the resolution can surely help us improve the model and capture more features precisely.
  5. The generated samples can be further improved by tweaking hyperparameters such as the learning rate and batch size, and by training for more epochs; the generator's loss fluctuates and is not decreasing.
  6. The CelebA data mostly contains images of different celebrities at different angles and under different lighting conditions.

Conclusion

We have seen how to generate realistic faces with a DCGAN implementation on the CelebA dataset. The generated images can be further improved by tuning the hyperparameters. One could also opt for a deeper network than the one here; doing so, however, would increase the number of parameters, which in turn would take much longer to train. Now open your Jupyter Notebook and implement the same. In the next article, I shall walk you through generating street view house numbers using the SVHN dataset. See you at the next one. Cheers!

Thank you.


A decision analyst with a keen interest in AI. When I don’t code, I try to understand the math behind neural nets.