Demystifying GANs in TensorFlow 2.0

Published in Towards Data Science · 6 min read · Jun 23, 2019

This tutorial shows how you can easily implement a Generative Adversarial Network (GAN) in the new TensorFlow version 2.0. We’ll focus on a basic implementation, which leaves room for optional enhancements. Before we take a closer look at the implementation, we need to understand the idea and theory behind GANs. If you are already familiar with the theory, you can skip to the implementation part.

The theory behind GANs

The main goal of a GAN is to generate data from scratch via an adversarial process. This process involves two models: the discriminator and the generator. The discriminator learns whether a sample is real, that is, drawn from the data distribution, or fake, while the generator tries to produce fake samples that trick the discriminator.

“The generative model can be thought of as analogous to a team of counterfeiters, trying to produce fake currency and use it without detection, while the discriminative model is analogous to the police, trying to detect the counterfeit currency.” [1]

Let’s get a bit more theoretical to grasp the fundamental process behind GANs. Both the discriminator and the generator can be thought of as multilayer perceptrons that compete against each other. The generator G(z) is fed with noise z drawn from the prior p_z(z), and its output later serves as input for the discriminator. The output of the discriminator D(x) is a single scalar that represents the probability that the input x is a real sample rather than a fake one produced by the generator. If D(x) is equal to 1, the discriminator is absolutely sure that the input is a real sample from the training data, and if D(x)≈0, the discriminator is convinced that the input is a fake produced by the generator.

The idea behind generative adversarial networks

The challenge is that we train D to maximize the probability of assigning the correct label to both real and generated samples, while we simultaneously train G to minimize that probability for the samples it generates. In the literature this is referred to as a minimax game.
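Written out in the notation of [1], the minimax game reads as follows. The symbol p_data for the distribution of the real training data is taken from [1] rather than from the text above.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```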

The blue-marked part of the equation, the expectation of log D(x), covers the probability that a real image is recognized as real, and the red part, the expectation of log(1 − D(G(z))), covers the probability that a generated image is recognized as fake. The equation above can also be understood as the objective of the discriminator, and the equation below as the objective of the generator, which we optimize so that the generator fools the discriminator better.
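In the notation of [1], the generator's objective is to minimize the term that involves the generated samples. [1] also notes that in practice it can work better to let G maximize log D(G(z)) instead, because this gives stronger gradients early in training. Training the generator against "real" labels corresponds to this second form.

```latex
\min_G \; \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
\qquad \text{or, in practice,} \qquad
\max_G \; \mathbb{E}_{z \sim p_z(z)}\left[\log D(G(z))\right]
```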

The generator’s input is a sample from a uniform or normal distribution, which we refer to as noise in the following. Afterwards, the discriminator is fed with real images and with the images the generator produced from the noise. By backpropagating the error through the neural network, it learns to distinguish between them.
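To get a feeling for what the discriminator maximizes and the generator minimizes, here is a minimal sketch that estimates the quantity from [1], the mean of log D(x) on real samples plus the mean of log(1 − D(G(z))) on generated samples, for a few made-up discriminator outputs. The function name `value_function` and the numbers are illustrative assumptions and not part of the notebook.

```python
import math


def value_function(d_real, d_fake):
    """Empirical estimate of V(D, G) from discriminator outputs.

    d_real: D(x) for real samples, d_fake: D(G(z)) for generated samples.
    All values must be probabilities strictly between 0 and 1.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical outputs of a discriminator that separates real from fake well.
confident = value_function(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])

# Hypothetical outputs of a discriminator that cannot tell them apart.
undecided = value_function(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(confident, undecided)
```

The confident discriminator reaches a value close to 0, the maximum possible. The undecided one lands at −log 4, which is the value [1] derives for the point where the generator matches the data distribution.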

Implementation

If you want to copy and paste the code or just adjust the notebook to your needs, check out the code on GitHub.

First, we import the basic libraries we need. We’ll use TensorFlow 2.0 for building the networks and the adversarial process, NumPy to generate the noise, and matplotlib for saving the images. Also make sure that you are using TensorFlow 2.0, because the code below won’t work with an older version!

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras

print(tf.__version__)

Next, we define some global variables that are needed throughout the implementation. The parameter BATCH_SIZE defines how many samples are fed into the neural networks at once. We set the BUFFER_SIZE parameter equal to the length of the training data set for perfect shuffling. If you want to use a different batch size or data set, don’t forget to adjust those variables. EPOCHES sets the number of training epochs, and OUTPUT_DIR defines the path where the output images of the generator are stored during training.

BATCH_SIZE = 256
BUFFER_SIZE = 60000
EPOCHES = 300
OUTPUT_DIR = "img" # The output directory where the images of the generator are stored during training

Now we’ll load the MNIST dataset directly from TensorFlow. It contains 60,000 training images with a height and width of 28 pixels each (each entry in the list represents one picture).

Before we implement the neural networks, the data are passed into a tf.data.Dataset object. A tf.data.Dataset represents a sequence of elements in which each element contains one or more Tensor objects. We’ll use it to store our images in batches and loop through them later. In addition, the images are normalized to lie between -1 and 1 (the same range as the uniform noise).
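The loading and preprocessing described above could look roughly like this. It is a minimal sketch and not the notebook's exact code: the helper names `load_mnist_images` and `make_dataset` are made up for illustration, and flattening each image to 784 values at this stage is an assumption based on the dense networks used later.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

BATCH_SIZE = 256
BUFFER_SIZE = 60000


def load_mnist_images():
    """Return the 60,000 MNIST training images (downloaded on first use)."""
    (train_images, _), (_, _) = keras.datasets.mnist.load_data()
    return train_images  # shape (60000, 28, 28), dtype uint8


def make_dataset(images, batch_size=BATCH_SIZE, buffer_size=BUFFER_SIZE):
    """Flatten, scale pixels from [0, 255] to [-1, 1], then shuffle and batch."""
    flat = images.reshape(images.shape[0], -1).astype(np.float32)
    flat = (flat - 127.5) / 127.5
    dataset = tf.data.Dataset.from_tensor_slices(flat)
    return dataset.shuffle(buffer_size).batch(batch_size)
```

With these helpers, `train_dataset = make_dataset(load_mnist_images())` yields batches of shape (256, 784), except for a smaller final batch.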

For building the generator as well as the discriminator, the tf.keras model subclassing API is used, which gives us a little more flexibility in constructing the models. This means we implement one class for the generator and one for the discriminator. In the constructor we define the layers of the network, and in the call method the forward pass of the model. The generator takes 100-dimensional noise as input and outputs a vector of size 784. Later, we reshape the vector back to a 28x28 matrix (the original size of the images). In addition, the generate_noise method creates random data points from the uniform distribution.
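A generator along these lines can be sketched with the subclassing API as follows. The number and width of the hidden layers, the activations, and the noise range of [-1, 1] are assumptions made for illustration. Only the 100-dimensional input, the 784-dimensional output, and the `generate_noise` method come from the description above.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras


class Generator(keras.Model):
    def __init__(self, noise_dim=100):
        super().__init__(name="generator")
        self.noise_dim = noise_dim
        # Hidden layer sizes are illustrative assumptions.
        self.hidden_1 = keras.layers.Dense(128, activation="relu")
        self.hidden_2 = keras.layers.Dense(256, activation="relu")
        # tanh keeps the output in [-1, 1], the range of the normalized images.
        self.output_layer = keras.layers.Dense(784, activation="tanh")

    def call(self, noise):
        # Forward pass: noise -> hidden layers -> flat 784-dimensional image.
        x = self.hidden_1(noise)
        x = self.hidden_2(x)
        return self.output_layer(x)

    def generate_noise(self, batch_size):
        # Uniform noise in [-1, 1), one row per sample.
        noise = np.random.uniform(-1, 1, size=(batch_size, self.noise_dim))
        return noise.astype("float32")
```

Calling `generator(generator.generate_noise(16))` returns a tensor of shape (16, 784), which `tf.reshape(output, (-1, 28, 28))` turns back into images.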

The objective function described above is nothing else than binary cross entropy. For the generator, it takes the discriminator’s output for the generated images together with labels that are all “real”, because the generator’s goal is to have its images classified as real.
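As a minimal sketch, the generator objective can be written with the Keras binary cross entropy loss. The function name `generator_objective` is chosen for illustration, and `from_logits=True` assumes that the discriminator's last layer has no sigmoid activation.

```python
import tensorflow as tf
from tensorflow import keras

# Assumption: the discriminator outputs raw logits, not probabilities.
cross_entropy = keras.losses.BinaryCrossentropy(from_logits=True)


def generator_objective(fake_output):
    """Small when the discriminator labels generated images as real (1)."""
    return cross_entropy(tf.ones_like(fake_output), fake_output)
```

The loss is close to zero when the discriminator is fooled (large positive logits for generated images) and grows when the fakes are detected.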

We do the same for the discriminator objective, but now we add the fake loss and the real loss together. In addition, we apply a little label smoothing to the real loss to keep the discriminator from becoming overconfident and overfitting.
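The discriminator objective can be sketched in the same way. The smoothing factor of 0.9 and the function name `discriminator_objective` are assumptions for illustration, and the loss again expects raw logits.

```python
import tensorflow as tf
from tensorflow import keras

# Assumption: the discriminator outputs raw logits, not probabilities.
cross_entropy = keras.losses.BinaryCrossentropy(from_logits=True)


def discriminator_objective(real_output, fake_output, smoothing_factor=0.9):
    """Sum of the loss on real images (smoothed labels) and on fake images."""
    # Real images get the label 0.9 instead of 1.0 (one-sided label smoothing).
    real_labels = tf.ones_like(real_output) * smoothing_factor
    real_loss = cross_entropy(real_labels, real_output)
    # Generated images keep the hard label 0.
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss
```

With smoothing, a discriminator that is extremely confident about real images is penalized slightly, which is the intended effect.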

The discriminator network is implemented in a similar way as the generator. The only difference is that it takes a 784-dimensional vector (28*28 = 784) as input and has a single output neuron, which tells us whether the input was a fake or a real image.
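A matching discriminator sketch looks like this. The hidden layer sizes are again illustrative assumptions, and the single output neuron returns a logit, so a sigmoid is needed to read it as a probability.

```python
import tensorflow as tf
from tensorflow import keras


class Discriminator(keras.Model):
    def __init__(self):
        super().__init__(name="discriminator")
        # Hidden layer sizes are illustrative assumptions.
        self.hidden_1 = keras.layers.Dense(256, activation="relu")
        self.hidden_2 = keras.layers.Dense(128, activation="relu")
        # One output neuron: a logit for "this image is real".
        self.output_layer = keras.layers.Dense(1)

    def call(self, images):
        # Forward pass: flat 784-dimensional image -> single logit.
        x = self.hidden_1(images)
        x = self.hidden_2(x)
        return self.output_layer(x)
```

For a batch of flattened images, `tf.sigmoid(discriminator(images))` gives one value between 0 and 1 per image.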

As optimizer we use RMSprop. This is just an arbitrary decision I made, so feel free to use any other optimizer you want.

generator_optimizer = keras.optimizers.RMSprop()
discriminator_optimizer = keras.optimizers.RMSprop()

The next step is to define what should happen in a single training step. By using the gradients of the generator and the discriminator, we train both networks simultaneously. First, some noise is generated according to the batch size we defined before. Afterwards, we feed the real images as well as the fake images into the discriminator and calculate its loss. The last step is to do the same for the generator and to apply the gradients of both networks.
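A single training step could be sketched as follows. To keep the example runnable on its own, it uses two small `keras.Sequential` stand-in networks instead of the subclassed models, draws the noise with `tf.random.uniform` rather than NumPy, and assumes a smoothing factor of 0.9. All of these are assumptions for illustration.

```python
import tensorflow as tf
from tensorflow import keras

NOISE_DIM = 100

# Small stand-in networks so that the sketch is self-contained.
generator = keras.Sequential([
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(784, activation="tanh"),
])
discriminator = keras.Sequential([
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(1),  # raw logit
])

cross_entropy = keras.losses.BinaryCrossentropy(from_logits=True)
generator_optimizer = keras.optimizers.RMSprop()
discriminator_optimizer = keras.optimizers.RMSprop()


def training_step(real_images):
    """Update both networks once on a batch of flattened real images."""
    batch_size = real_images.shape[0]
    noise = tf.random.uniform((batch_size, NOISE_DIM), minval=-1.0, maxval=1.0)

    # One tape per network, so each loss is differentiated separately.
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)
        real_output = discriminator(real_images, training=True)
        fake_output = discriminator(fake_images, training=True)

        gen_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
        disc_loss = (
            cross_entropy(tf.ones_like(real_output) * 0.9, real_output)
            + cross_entropy(tf.zeros_like(fake_output), fake_output)
        )

    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(
        zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(
        zip(disc_grads, discriminator.trainable_variables))
    return gen_loss, disc_loss
```

The step runs eagerly here for simplicity. Each loss only updates the variables of its own network, which is why the gradients are taken with respect to `generator.trainable_variables` and `discriminator.trainable_variables` separately.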

The last step is to define a function that describes the whole training. Thanks to the tf.data.Dataset object, we can easily iterate over our data. A final helper function saves the generated images every 50th epoch to the directory you defined in the variable OUTPUT_DIR.
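The training loop and the image-saving helper could be sketched like this. The function names, the 4x4 grid of 16 samples, and the file naming scheme are assumptions for illustration. `training_step` and `sample_fn` are passed in as arguments, so any step function and any function that returns flattened images in [-1, 1] can be plugged in.

```python
import os

import matplotlib
matplotlib.use("Agg")  # write image files without needing a display
import matplotlib.pyplot as plt
import numpy as np

OUTPUT_DIR = "img"


def save_images(flat_images, epoch, output_dir=OUTPUT_DIR):
    """Save 16 generated digits as a 4x4 grid and return the file path."""
    os.makedirs(output_dir, exist_ok=True)
    images = np.asarray(flat_images).reshape(-1, 28, 28)
    images = (images + 1.0) / 2.0  # map [-1, 1] back to [0, 1] for display
    fig, axes = plt.subplots(4, 4, figsize=(4, 4))
    for ax, image in zip(axes.flat, images):
        ax.imshow(image, cmap="gray", vmin=0.0, vmax=1.0)
    for ax in axes.flat:
        ax.axis("off")
    path = os.path.join(output_dir, "epoch_{:04d}.png".format(epoch))
    fig.savefig(path)
    plt.close(fig)
    return path


def train(dataset, epochs, training_step, sample_fn, output_dir=OUTPUT_DIR):
    """Run training_step on every batch and save samples every 50th epoch."""
    for epoch in range(epochs):
        for batch in dataset:
            training_step(batch)
        if epoch % 50 == 0:
            save_images(sample_fn(16), epoch, output_dir)
```

With a generator at hand, `sample_fn` could for example be `lambda n: generator(generator.generate_noise(n)).numpy()`.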

As you can see in my Jupyter notebook, I ran the training for 300 epochs and achieved quite good results. I think that with a bit more fine-tuning and a deeper generator and discriminator you can achieve even better results on simple datasets like MNIST.

Exemplary results of the implemented GAN

Keep in mind that this is one of the simplest GAN implementations you can imagine. You would achieve much better results by using convolutions in the discriminator as well as the generator. You can check out the tutorial provided by the TensorFlow team, which uses convolutions instead of dense layers.

Link to the Jupyter Notebook: https://github.com/MonteChristo46/GAN-Notebooks

References

[1] Goodfellow et al. (2014): Generative Adversarial Networks. arXiv:1406.2661

Written by Daniel