
Conditional and Controllable Generative Adversarial Networks

Understanding Conditional and Controllable GANs And Implementing CGAN In TensorFlow 2.x

Photo by Mr TT on Unsplash

In this article, we will look at conditional and controllable GANs, why they are needed, and how to implement a naive conditional GAN using TensorFlow 2.x. Before you read further, I would like you to be familiar with DCGANs, which you can read about here.

Why Conditional GAN

Until now, the generator has been generating images randomly, and we had no control over the class of the generated image. While training a GAN on MNIST, the generator produces a random digit each time: it may generate a one, a six, or a three, and we cannot know in advance which. But with conditional GANs, we can tell the generator to produce an image of a one or a six. This is where conditional GANs come in handy: with a conditional GAN, you can generate images of the class of your choice.

How does it work?

Until now, we were feeding images as the only input to the generator and discriminator. Now we will also feed class information to both networks.

  1. The generator takes random noise and a one-hot encoded class label as input, and outputs a fake image of that class.
  2. The discriminator takes an image with the one-hot label appended as extra depth channels, i.e. if you have an image of size 28 × 28 × 1 and a one-hot vector of size n, then the combined input size will be 28 × 28 × (n + 1).
  3. The discriminator outputs whether the image is a real or a fake example of that class.
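The shapes involved in the two steps above can be sketched with NumPy. This is a minimal illustration, assuming MNIST-sized 28 × 28 × 1 images, 10 classes, and a 64-dimensional noise vector (the exact sizes are my assumptions, not fixed by the article):

```python
import numpy as np

noise_dim, n_classes = 64, 10                       # assumed sizes
image = np.random.rand(28, 28, 1).astype("float32") # stand-in for an MNIST image
label = np.eye(n_classes, dtype="float32")[3]       # one-hot label for digit 3

# Generator input: noise concatenated with the one-hot label.
gen_input = np.concatenate([np.random.randn(noise_dim).astype("float32"), label])
print(gen_input.shape)   # (74,)

# Discriminator input: one channel per class appended to the image,
# so a 28x28x1 image becomes 28x28x(1 + n_classes).
label_channels = np.ones((28, 28, n_classes), dtype="float32") * label
disc_input = np.concatenate([image, label_channels], axis=-1)
print(disc_input.shape)  # (28, 28, 11)
```

Only the channel for the true class ends up filled with ones; the other nine appended channels are all zeros.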

Code

The code for this article is almost the same as that of DCGAN, with some modifications. Let us look at those differences.

Note: the following implementation is a naive approach and is very slow. You can refer here for a much better way of coding conditional GANs.

Combining Images and Labels

  1. First, we load the MNIST dataset and normalize the images.
  2. Then, we define an add_channels function that takes an image and its one-hot label as inputs and outputs the image with additional depth channels representing the one-hot label. Of the appended channels, only one (the channel for the true class) contains ones; all the others contain zeros.
  3. We iterate over all the images and their labels. For each element of the one-hot label, we create a channel of the image's spatial shape filled with that element's value, and then stack these channels onto the image.
  4. Since we have 10 classes here, we loop over the 10 elements of the one-hot label.
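The steps above can be sketched as follows. The function name add_channels comes from the article; the stand-in image is my assumption (real code would load MNIST with tf.keras.datasets.mnist and scale pixels to [-1, 1]):

```python
import numpy as np

def add_channels(image, one_hot):
    """Append one depth channel per class to the image. The channel for
    the true class is filled with ones, the rest with zeros."""
    h, w = image.shape[:2]
    channels = [np.full((h, w, 1), value, dtype="float32") for value in one_hot]
    return np.concatenate([image] + channels, axis=-1)

# Stand-in for one normalised MNIST image.
image = np.random.uniform(-1, 1, (28, 28, 1)).astype("float32")
one_hot = np.eye(10, dtype="float32")[7]   # label for digit 7

combined = add_channels(image, one_hot)
print(combined.shape)  # (28, 28, 11)
```

Looping over every image in Python like this is exactly why the naive approach is slow; vectorised broadcasting over the whole batch is much faster.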

Combining Noise and Labels

  5. The following code combines the noise vector with the one-hot label, and another function outputs the input dimensions for the generator and discriminator.
  6. This is needed because our models are built with the Sequential API, so we need to pass the input dimensions explicitly.
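A minimal sketch of these two helpers, assuming the same sizes as before (the function names and the 64-dimensional noise vector are my assumptions):

```python
import numpy as np

noise_dim, n_classes = 64, 10  # assumed sizes

def combine_noise_and_label(noise, one_hot):
    """Concatenate the noise vector with the one-hot class label."""
    return np.concatenate([noise, one_hot], axis=-1)

def input_dims(noise_dim, n_classes, image_shape=(28, 28, 1)):
    """Input sizes the Sequential generator and discriminator expect."""
    h, w, c = image_shape
    return noise_dim + n_classes, (h, w, c + n_classes)

noise = np.random.randn(noise_dim).astype("float32")
one_hot = np.eye(n_classes, dtype="float32")[6]
gen_input = combine_noise_and_label(noise, one_hot)

gen_dim, disc_shape = input_dims(noise_dim, n_classes)
print(gen_input.shape, gen_dim, disc_shape)  # (74,) 74 (28, 28, 11)
```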

Training Loop

  7. The training loop is the same as DCGAN's, but this time we combine noise and labels for the generator, and we also append the label channels to the fake images before feeding them to the discriminator.
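One conditional training step can be sketched like this. The tiny dense models, optimizers, and sizes are placeholders of my own choosing (any DCGAN-style generator/discriminator pair would slot in); only the label handling differs from a plain DCGAN step:

```python
import tensorflow as tf

noise_dim, n_classes = 64, 10
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Stand-in models; a real DCGAN would use conv/deconv layers.
generator = tf.keras.Sequential([
    tf.keras.Input(shape=(noise_dim + n_classes,)),
    tf.keras.layers.Dense(28 * 28, activation="tanh"),
    tf.keras.layers.Reshape((28, 28, 1)),
])
discriminator = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1 + n_classes)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),
])
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def add_label_channels(images, one_hot):
    # Broadcast each one-hot vector to image-sized depth channels.
    maps = tf.ones_like(images) * tf.reshape(one_hot, (-1, 1, 1, n_classes))
    return tf.concat([images, maps], axis=-1)

def train_step(real_images, one_hot):
    batch = tf.shape(real_images)[0]
    noise = tf.random.normal((batch, noise_dim))
    gen_in = tf.concat([noise, one_hot], axis=-1)  # noise + label for the generator
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(gen_in, training=True)
        real_out = discriminator(add_label_channels(real_images, one_hot), training=True)
        fake_out = discriminator(add_label_channels(fake_images, one_hot), training=True)
        d_loss = bce(tf.ones_like(real_out), real_out) + bce(tf.zeros_like(fake_out), fake_out)
        g_loss = bce(tf.ones_like(fake_out), fake_out)
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    return float(g_loss), float(d_loss)

# One step on random stand-in data (real code iterates over MNIST batches).
imgs = tf.random.uniform((8, 28, 28, 1), -1.0, 1.0)
labels = tf.one_hot(tf.random.uniform((8,), 0, n_classes, dtype=tf.int32), n_classes)
g_loss, d_loss = train_step(imgs, labels)
```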

The above implementation of conditional GANs is very slow, but it works and is good for understanding the concept.


Controllable GANs

Conditional GANs help us generate images of classes of our choice, but we have no control over the content of the output image: what if we want a dog with a red hat or glasses? This is where controllable GANs come into the picture. But how do they work? Let us find out.

Concept

Controllable GANs are useful for getting features of your choice in generated images. For example, if you want to generate an image of a person with black hair and green eyes, you need to tweak the input noise accordingly.

  1. When you feed the generator a random noise vector, elements of this vector correspond to features in the generated image.
  2. Changing an element of the noise vector changes some feature in the image; for example, a particular change may alter the color of a person's hair or eyes.
  3. This works by mapping directions in the noise vector space to the corresponding features.
  4. It can be done using a pre-trained classifier that tells whether a particular feature is present in a generated image, e.g. whether a person's eyes are green. Such a classifier can be used to find noise directions for different features.
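The idea in point 4 can be sketched as gradient ascent on the classifier's feature score with respect to the noise. The untrained stand-in generator and feature_classifier below are my assumptions (in practice both would be pre-trained); only the update rule is the point:

```python
import tensorflow as tf

noise_dim = 64
# Untrained stand-ins for a pre-trained generator and feature classifier.
generator = tf.keras.Sequential([
    tf.keras.Input(shape=(noise_dim,)),
    tf.keras.layers.Dense(28 * 28, activation="tanh"),
    tf.keras.layers.Reshape((28, 28, 1)),
])
feature_classifier = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),   # logit for "feature present" (e.g. green eyes)
])

noise = tf.Variable(tf.random.normal((1, noise_dim)))
for _ in range(10):
    with tf.GradientTape() as tape:
        score = feature_classifier(generator(noise, training=False))
    grad = tape.gradient(score, noise)
    noise.assign_add(0.1 * grad)   # nudge noise toward "feature present"

final_score = float(feature_classifier(generator(noise)))
```

The generator's weights stay frozen; only the noise vector is updated, which is what makes this "controlling" rather than retraining.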

Challenges to controllable GANs

Controllable GANs face two main challenges:

  1. Correlation between features: changing an element of the noise vector to alter one feature (such as adding facial hair to an image of a female face) can change other features too (such as the apparent gender of the person). This may happen because the model has mostly seen facial hair on masculine faces.
  2. Vector-space entanglement: when the noise space is entangled, a single element of the noise vector affects several features at once, so there is no clean one-to-one mapping between noise elements and features.

Conclusion

Conditional GANs are used for generating images belonging to classes of our choice, while Controllable GANs are used for controlling features in images.


You can find the full code for this article here. Stay tuned for upcoming articles where we will be implementing more algorithms.

So, this concludes the article. Thank you for reading; I hope you enjoyed it and were able to understand what I wanted to explain. I hope you read my upcoming articles too. Hari Om…🙏


