Deep Learning

What are Generative Adversarial Networks (GANs)?
Designed by Ian Goodfellow and his colleagues in 2014, GANs consist of two neural networks trained together in a zero-sum game, where one player's loss is the other's gain.
To understand GANs we need to be familiar with generative models and discriminative models.
Generative models learn the distribution of the training set and use it to generate new data instances. They capture the joint probability p(X, Y).
There are two types of generative models:
- Explicit density models
- Implicit density models
Explicit density models define an explicit density function while implicit density models define a stochastic procedure that can directly generate data.
Discriminative models, on the other hand, capture the conditional probability p(Y | X): they learn to differentiate between different data instances.

Generative models solve more difficult tasks than discriminative models: they must capture far more detail about the data. Simply put, generative models do more work, because they try to approximate the real data distribution as closely as possible.
In the figure above, we can see that the discriminative model merely tries to separate the data space of 0's and 1's, whereas the generative model closely approximates that data space.
Now that you know the basic definitions of generative and discriminative models, let us learn about GANs.
The Discriminator & Generator networks – The GAN Game

Generative Adversarial Networks (GANs) are generative models that generate whole images in parallel. A GAN consists of two networks: a discriminator and a generator.

GANs use a differentiable function, usually a neural network, which we call the generator network. The generator takes random noise as input and transforms and reshapes it into a recognizable structure, such as an image. The output depends heavily on the noise fed into the generator.
For different noise inputs, we can generate many different images. However, the generator does not immediately start producing realistic images. We need to train it.
How do we train this generator network? Probably the same way as any other network? Actually no!
The generator is trained so that its outputs follow the same probability distribution as the many images it learns from. How is that done? 👀
Here comes the Discriminator, a regular neural network classifier. The discriminator guides our generator network.
For simplicity, let us call the output images of the generator fake images. These fake images are given to the discriminator as input, alongside so-called real images from the training data. The discriminator then outputs the probability that its input is a real image: 1 for real images and 0 for fake images. Meanwhile, the generator tries to output images that the discriminator would assign a probability of 1.
Most machine learning models try to minimize some cost function by optimizing their parameters. If we were to assign cost functions to GANs, we could say that the cost of the discriminator is the negative of the cost of the generator, and vice versa.
So let us try to understand how GANs work by treating the discriminator and the generator as two players competing over a function f.
The generator tries to decrease the value of f, while the discriminator tries to increase it. Training continues until we reach an equilibrium where the generator can no longer decrease the value of f and the discriminator can no longer increase it. Because we run two optimization algorithms simultaneously, one for the generator and one for the discriminator, we may never actually reach this equilibrium in practice. The Adam optimizer is a good choice for both.
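For reference, this two-player game can be written down concretely. The function the players compete over is the minimax objective from the original GAN paper, where the discriminator D tries to maximize the value V and the generator G tries to minimize it:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

Here D(x) is the probability the discriminator assigns to x being real, and G(z) is the image the generator produces from the noise vector z.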
Briefly speaking, the generator and discriminator compete: the generator feeds fake data to the discriminator, and the discriminator, which also sees the training data, predicts whether the received image is real or fake.
Look at this example below from Google Developers' Machine Learning Crash Course.
The generator begins with unrealistic images and quickly learns to fool the discriminator.

Thus, over time the generator is trained to fool the discriminator, making its fake images look much like the real ones the discriminator sees.
So what does the training process look like?
During discriminator training, the discriminator classifies both real images from the training data and fake images from the generator. The discriminator loss penalizes it for every incorrectly classified image, and the discriminator updates its weights through backpropagation.
Similarly, the generator is given noisy inputs to generate fake images. These images are passed to the discriminator, and the generator loss penalizes the generator for producing samples that the discriminator classifies as fake. Weights are updated through backpropagation, flowing from the discriminator back into the generator.
It is important to note that the generator is kept constant during the discriminator training phase, and the discriminator is kept constant during the generator training phase. Thus GAN training proceeds in an alternating fashion.
MNIST GAN
In this section, we will design a GAN that can generate new images of handwritten digits, using the famous MNIST dataset.
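As a rough sketch, the dataset can be loaded in PyTorch through torchvision; the batch size here is just an arbitrary choice, not a value prescribed by this article:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# download MNIST and convert each image to a tensor with pixel values in [0, 1]
transform = transforms.ToTensor()
train_data = datasets.MNIST(root='data', train=True, download=True, transform=transform)

# serve the images in shuffled mini-batches during training
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)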
The Discriminator architecture
The discriminator will be a fairly typical classifier built from fully connected (linear) layers.
The activation function we will use is the leaky ReLU.

Why leaky ReLU? We use a leaky ReLU so that gradients can flow backward through the layer unhindered. A leaky ReLU is like a normal ReLU, except that it produces a small non-zero output for negative input values.
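Here is a minimal sketch of such a discriminator in PyTorch. The layer sizes, the 0.2 negative slope, and the 0.3 dropout are illustrative choices, not values prescribed by this article:

import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    def __init__(self, input_size=784, hidden_dim=128, output_size=1):
        super().__init__()
        # fully connected layers that shrink the flattened 28x28 image down to a single logit
        self.fc1 = nn.Linear(input_size, hidden_dim * 4)
        self.fc2 = nn.Linear(hidden_dim * 4, hidden_dim * 2)
        self.fc3 = nn.Linear(hidden_dim * 2, hidden_dim)
        self.fc4 = nn.Linear(hidden_dim, output_size)
        self.dropout = nn.Dropout(0.3)  # dropout to help the discriminator generalize

    def forward(self, x):
        x = x.view(-1, 784)  # flatten the image
        x = self.dropout(F.leaky_relu(self.fc1(x), negative_slope=0.2))
        x = self.dropout(F.leaky_relu(self.fc2(x), negative_slope=0.2))
        x = self.dropout(F.leaky_relu(self.fc3(x), negative_slope=0.2))
        return self.fc4(x)  # raw logit; the sigmoid is applied inside the loss function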
The Generator architecture
The generator uses latent samples to make fake images. These latent samples are vectors that are mapped to the fake images. A latent vector is just a compressed, feature-level representation of an image!
To understand what a latent sample is, consider an autoencoder: the outputs that connect the encoder and decoder portions of the network form a compressed representation that can also be referred to as a latent vector.
The activation function for all the layers remains the same, except that we will use tanh at the output.

Why tanh at the output? The generator has been found to perform best with tanh for its output layer, which scales the output to be between -1 and 1, instead of 0 and 1.
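A corresponding generator sketch, again with illustrative layer sizes and an assumed latent vector size of 100:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    def __init__(self, z_size=100, hidden_dim=32, output_size=784):
        super().__init__()
        # fully connected layers that grow the latent vector into a flattened 28x28 image
        self.fc1 = nn.Linear(z_size, hidden_dim * 4)
        self.fc2 = nn.Linear(hidden_dim * 4, hidden_dim * 8)
        self.fc3 = nn.Linear(hidden_dim * 8, output_size)
        self.dropout = nn.Dropout(0.3)

    def forward(self, z):
        z = self.dropout(F.leaky_relu(self.fc1(z), negative_slope=0.2))
        z = self.dropout(F.leaky_relu(self.fc2(z), negative_slope=0.2))
        return torch.tanh(self.fc3(z))  # tanh squashes the output to [-1, 1]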
Scaling images
We want the output of the generator to be comparable to the real images' pixel values, which are normalized to lie between 0 and 1. Thus, we also have to scale our real input images to have pixel values between -1 and 1 when we train the discriminator. This will be done during the training phase.
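A tiny helper for this rescaling might look like the following; the name scale is just an illustrative choice, and it assumes the inputs arrive with pixel values in [0, 1]:

def scale(x, feature_range=(-1, 1)):
    # rescale images from [0, 1] to [-1, 1] so they match the generator's tanh output
    min_val, max_val = feature_range
    return x * (max_val - min_val) + min_val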
Generalization
To help the discriminator generalize better, the labels are reduced a bit from 1.0 to 0.9. For this, we’ll use the parameter smooth; if True, then we should smooth our labels. In PyTorch, this looks like:
labels = torch.ones(size) * 0.9
We also made use of dropout layers to avoid overfitting.
Loss calculation
The discriminator’s goal is to output a 1 for real and 0 for fake images. On the other hand, the generator wants to make fake images that closely resemble the real ones.
Thus, if D(x) represents the discriminator's output, the probability that x is real, we can state:
The goal of the discriminator: D(real_images) = 1 and D(fake_images) = 0
The goal of the generator: D(fake_images) = 1 (the generator has no control over the real images)
We will use BCEWithLogitsLoss, which combines a sigmoid activation function (we want the discriminator to output a value 0–1 indicating whether an image is real or fake) and binary cross-entropy loss.
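As a sketch, the two losses can be wrapped in small helpers. The helper names real_loss and fake_loss are assumptions; the smooth flag follows the label-smoothing idea described above:

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()  # sigmoid + binary cross-entropy in one numerically stable call

def real_loss(d_logits, smooth=False):
    batch_size = d_logits.size(0)
    # targets are 1 for real images, optionally smoothed to 0.9
    labels = torch.ones(batch_size) * (0.9 if smooth else 1.0)
    return criterion(d_logits.squeeze(), labels)

def fake_loss(d_logits):
    batch_size = d_logits.size(0)
    # targets are 0 for fake images
    labels = torch.zeros(batch_size)
    return criterion(d_logits.squeeze(), labels)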

Training
As mentioned earlier, Adam is a suitable optimizer.
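For instance, we can use one Adam optimizer per network. The learning rate of 0.002 below is just a typical starting point, not a value taken from this article, and D and G are assumed to be instances of the networks sketched earlier:

import torch.optim as optim

# instantiate the two networks sketched above
D = Discriminator()
G = Generator()

# separate optimizers, since the two networks are trained in alternation
d_optimizer = optim.Adam(D.parameters(), lr=0.002)
g_optimizer = optim.Adam(G.parameters(), lr=0.002)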
The generator takes in a vector z and outputs fake images. The discriminator alternates between training on the real images and on the fake images produced by the generator.
Steps involved in discriminator training (put together in the code sketch after this list):
- We first compute the loss on real images
- Generate fake images
- Compute loss on fake images
- Add the loss of the real and fake images
- Perform backpropagation and update weights of the discriminator
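Putting these steps together, one pass of discriminator training might look like this sketch; it assumes the networks, loss helpers, scale function, optimizers, and data loader sketched earlier, plus an assumed latent size of 100:

import torch

z_size = 100  # assumed size of the latent vector fed to the generator

for real_images, _ in train_loader:
    # rescale the real images from [0, 1] to [-1, 1]
    real_images = scale(real_images)

    d_optimizer.zero_grad()

    # 1. compute the loss on real images (with label smoothing)
    d_real_loss = real_loss(D(real_images), smooth=True)

    # 2-3. generate fake images and compute the loss on them;
    #      detach so this update does not backpropagate into the generator
    z = torch.randn(real_images.size(0), z_size)
    d_fake_loss = fake_loss(D(G(z).detach()))

    # 4-5. add the two losses, backpropagate, and update the discriminator
    d_loss = d_real_loss + d_fake_loss
    d_loss.backward()
    d_optimizer.step()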
Steps involved in generator training (put together in the code sketch after this list):
- Generate fake images
- Compute loss on fake images with flipped labels
- Perform backpropagation and update the weights of the generator
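And the corresponding generator update, which runs inside the same batch loop right after the discriminator step; note the flipped labels, real_loss applied to the fake images:

# (continues inside the same batch loop, right after the discriminator update)
g_optimizer.zero_grad()

# generate fake images from fresh noise
z = torch.randn(real_images.size(0), z_size)
fake_images = G(z)

# flipped labels: the generator is rewarded when D labels its fakes as real
g_loss = real_loss(D(fake_images))

# backpropagate through the discriminator into the generator, then update G only
g_loss.backward()
g_optimizer.step()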
Training loss
We shall plot generator and discriminator losses against the number of epochs.
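A minimal plotting sketch, assuming the per-epoch losses were collected into lists named d_losses and g_losses (these names are assumptions):

import matplotlib.pyplot as plt

plt.plot(d_losses, label='Discriminator')
plt.plot(g_losses, label='Generator')
plt.title('Training losses')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()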

Samples generated by the generator
At the start

Over time

This way the generator starts out with noisy images and learns over time.
You can check out the code and readme file on my GitHub profile as well.
Conclusions
Since Ian Goodfellow and his colleagues at the University of Montreal designed GANs, they have exploded in popularity, and the number of applications is remarkable. GANs have been further improved by many variants, including CycleGAN, Conditional GAN, and Progressive GAN. Now open a Jupyter notebook and try to implement what you have learned.