Create Art with Deep Learning using PyTorch and Geometrical Shapes

While going through this Google experiment gallery, I stumbled upon this interesting tool. Its algorithm reconstructs the input image from simple geometric shapes such as circles, ellipses, triangles and rectangles, and in the process a real image starts to look more like a painting or a cartoon. You can check out examples on this website. The algorithm works by minimizing, step by step, the pixel-wise mean squared error (MSE) loss between the input image and the constructed image. Each shape is optimized separately, one after another, and the optimizer is a hill-climbing algorithm rather than a gradient-based one.
I started wondering whether this task could be accomplished with a gradient-based algorithm, and whether all the shapes could be optimized in parallel instead of sequentially. To apply a gradient-based algorithm, the loss function has to be differentiable with respect to the parameters, which in this case are the positions and sizes of the geometrical shapes. The plain MSE loss is not: the RGB value jumps abruptly from inside to outside the shape, so the rendered image is discontinuous in these parameters and the loss gives no useful gradient. To make it differentiable, I apply the following transformation.
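To see the problem concretely, here is a minimal PyTorch sketch. The helper name `hard_circle` and the 64-pixel grid are illustrative assumptions, not taken from the original code. With a hard-edged circle, the MSE loss is piecewise constant in the radius, so a tiny nudge leaves the loss unchanged and autograd has nothing to propagate.

```python
import torch

def hard_circle(radius, size=64):
    # Pixel coordinates in [-0.5, 0.5], so the image width is one unit.
    axis = torch.linspace(-0.5, 0.5, size)
    x, y = axis[None, :], axis[:, None]
    r = torch.sqrt(x**2 + y**2)
    # Hard edge: 1 inside the circle, 0 outside.
    return (r < radius).float()

target = hard_circle(0.3)

def hard_mse(radius):
    return torch.mean((hard_circle(radius) - target) ** 2).item()

# A tiny change in radius flips no pixels, so the loss does not move.
loss_before = hard_mse(0.2)
loss_after = hard_mse(0.2 + 1e-4)

# The comparison also cuts the autograd graph: no gradient reaches the radius.
radius = torch.tensor(0.2, requires_grad=True)
mask = hard_circle(radius)
print(loss_before == loss_after, mask.requires_grad)
```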
Let’s start with a 360 × 360-pixel image in which I want to put a circle at the center. Taking the center of the image as the origin, every pixel is assigned an (R, theta) value, with the width of the image being one unit. The plots below show a circle of radius 0.2 centered at the origin, where the value is tanh(5(0.2 - R)/0.2) inside the circle and zero outside. The tanh activation makes the transition from inside to outside smooth while still preserving its sudden nature. The gradient is concentrated in roughly the outer 20 percent of the circle, and that is where the learning happens!
To borrow an analogy from quantum mechanics, the tanh activation turns the particle-like nature of the geometrical object into a wave-like one!
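The soft circle described above can be sketched in a few lines of PyTorch. This is a minimal illustration under my own assumptions: the function name `soft_circle`, the `sharpness` argument (the factor 5 in the formula) and the use of `clamp` to zero out the exterior are hypothetical choices, not the original code.

```python
import torch

def soft_circle(radius, size=360, sharpness=5.0):
    # Pixel coordinates in [-0.5, 0.5]: origin at the center, width = 1 unit.
    axis = torch.linspace(-0.5, 0.5, size)
    x, y = axis[None, :], axis[:, None]
    r = torch.sqrt(x**2 + y**2)
    # tanh(5 (radius - R) / radius) inside the circle, zero outside.
    return torch.clamp(torch.tanh(sharpness * (radius - r) / radius), min=0.0)

radius = torch.tensor(0.2, requires_grad=True)
mask = soft_circle(radius)

# Unlike a hard-edged circle, the soft one passes a gradient to the radius.
mask.sum().backward()
print(mask.shape, radius.grad)
```

Growing the radius can only increase the pixel values here, so the gradient of the summed mask with respect to the radius comes out positive.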




By adding a few more parameters that control rotation and scaling along the X and Y directions, we can start constructing ellipses. The animation below shows an example where an ellipse is given as the target image and a single geometric object transforms into that ellipse under gradient descent.
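One way to add those parameters is to shift, rotate and rescale the pixel coordinates before applying the same tanh profile. The sketch below assumes that parameterization; `soft_ellipse` and its arguments (`cx`, `cy`, semi-axes `a` and `b`, `angle` in radians) are hypothetical names for illustration.

```python
import math
import torch

def soft_ellipse(cx, cy, a, b, angle, size=128, sharpness=5.0):
    # Coordinates relative to the ellipse center, image width = 1 unit.
    axis = torch.linspace(-0.5, 0.5, size)
    x, y = axis[None, :] - cx, axis[:, None] - cy
    # Rotate the coordinate frame by `angle`.
    cos, sin = torch.cos(angle), torch.sin(angle)
    xr = cos * x + sin * y
    yr = -sin * x + cos * y
    # Normalized radius: 1 on the ellipse boundary (epsilon keeps sqrt smooth).
    rho = torch.sqrt((xr / a) ** 2 + (yr / b) ** 2 + 1e-12)
    return torch.clamp(torch.tanh(sharpness * (1.0 - rho)), min=0.0)

zero = torch.tensor(0.0)
a, b = torch.tensor(0.3), torch.tensor(0.1)
flat = soft_ellipse(zero, zero, a, b, torch.tensor(0.0))         # long axis along X
tall = soft_ellipse(zero, zero, a, b, torch.tensor(math.pi / 2))  # long axis along Y
print(flat.sum().item(), tall.sum().item())
```

With `a == b` and zero rotation this reduces to the circle formula, since tanh(5(r0 - R)/r0) equals tanh(5(1 - R/r0)).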


Notice that the boundary of the ellipse is blurry, because that is where the transition happens. The RGB value of each pixel is set to the value after the tanh activation, and the loss in this case is the pixel-wise MSE.
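A single-object fit of this kind can be sketched as follows. This is an assumption-laden illustration, not the original training loop: the `soft_ellipse` helper, the parameter layout `[cx, cy, a, b, angle]`, the Adam optimizer, the learning rate and the step count are all my own choices.

```python
import torch

def soft_ellipse(params, size=64, sharpness=5.0):
    # params = [cx, cy, a, b, angle]; image width = 1 unit.
    cx, cy, a, b, angle = params
    axis = torch.linspace(-0.5, 0.5, size)
    x, y = axis[None, :] - cx, axis[:, None] - cy
    cos, sin = torch.cos(angle), torch.sin(angle)
    xr, yr = cos * x + sin * y, -sin * x + cos * y
    rho = torch.sqrt((xr / a) ** 2 + (yr / b) ** 2 + 1e-12)
    return torch.clamp(torch.tanh(sharpness * (1.0 - rho)), min=0.0)

# Target: a shifted, rotated ellipse. Start: a centered circle.
target = soft_ellipse(torch.tensor([0.05, -0.05, 0.30, 0.15, 0.5]))
params = torch.tensor([0.0, 0.0, 0.2, 0.2, 0.0], requires_grad=True)
optimizer = torch.optim.Adam([params], lr=0.01)

losses = []
for step in range(200):
    optimizer.zero_grad()
    # Pixel-wise MSE between the rendered object and the target image.
    loss = torch.mean((soft_ellipse(params) - target) ** 2)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"MSE: {losses[0]:.5f} -> {losses[-1]:.5f}")
```

All five parameters are updated together in every step, which is the contrast with optimizing one shape attribute at a time by trial and error.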
Let’s Paint
Now we are all set to start painting. All we need is a bunch of these circles, a little more math, some colors and PyTorch’s autograd to do the magic!
When I use multiple geometrical objects, they will overlap, so the next thing to figure out is the ordering of the objects in the picture. I order them by area: bigger objects go to the background and smaller ones move to the front. I have also added a parameter for the opacity of the objects, which can be set before starting the optimization.
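The ordering rule can be sketched with standard back-to-front alpha blending. The blending formula, the white background and the name `composite` are assumptions for illustration; the post only states that larger objects go behind smaller ones and that opacity is a fixed setting.

```python
import torch

def composite(masks, colors, opacity=0.5, background=1.0):
    """Paint soft masks onto a canvas, largest area first.

    masks:  (N, H, W) soft masks in [0, 1]
    colors: (N, 3) RGB values in [0, 1]
    """
    n, h, w = masks.shape
    areas = masks.sum(dim=(1, 2))
    # Largest area is painted first, so it ends up in the background.
    order = torch.argsort(areas, descending=True).tolist()
    canvas = torch.full((3, h, w), float(background))
    for i in order:
        alpha = opacity * masks[i]  # per-pixel coverage of object i
        canvas = canvas * (1.0 - alpha) + colors[i][:, None, None] * alpha
    return canvas

# A small blue object and a big red one, deliberately given in "wrong" order.
small = torch.zeros(4, 4)
small[0, 0] = 1.0
big = torch.ones(4, 4)
masks = torch.stack([small, big])
colors = torch.tensor([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
image = composite(masks, colors, opacity=1.0)
print(image[:, 0, 0], image[:, 3, 3])
```

The sort itself is discrete, but everything after it is differentiable with respect to the masks and colors, so gradients still flow to the shape parameters.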
Initially, the objects are scattered randomly throughout the image, and the RGB value of each object is set to the weighted mean RGB value of the area it occupies in the image. The weight is the weight of the object in the picture: for instance, an object in front gets a higher weight than one in the background. If the actual color of the picture at the boundary of the object is the same as that of the object, the object expands in that direction; if it is different, the object retreats or contracts from that direction. The boundary of the object acts as a color sensor because the gradient is present only at the edges.
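One possible reading of this color initialization is sketched below. It assumes that an object's weight at a pixel is its own coverage multiplied by how much of it remains visible through the objects in front of it; that interpretation, and the names `visibility_weights` and `initial_colors`, are mine and may differ from the original code.

```python
import torch

def visibility_weights(masks, opacity=0.5):
    # masks: (N, H, W), ordered from back (index 0) to front (index N - 1).
    alpha = opacity * masks
    # Fraction of light that passes through everything in front of object i.
    through = torch.ones_like(alpha)
    for i in range(len(alpha) - 2, -1, -1):
        through[i] = through[i + 1] * (1.0 - alpha[i + 1])
    return alpha * through

def initial_colors(image, masks, opacity=0.5, eps=1e-8):
    # image: (3, H, W). Returns one weighted mean RGB value per object: (N, 3).
    w = visibility_weights(masks, opacity)
    weighted_sum = torch.einsum("nhw,chw->nc", w, image)
    return weighted_sum / (w.sum(dim=(1, 2))[:, None] + eps)

# Two fully overlapping objects: the front one gets the larger weight.
weights = visibility_weights(torch.ones(2, 2, 2), opacity=0.5)

# Left half of the image is red, right half is blue.
image = torch.zeros(3, 4, 4)
image[0, :, :2] = 1.0
image[2, :, 2:] = 1.0
left = torch.zeros(4, 4)
left[:, :2] = 1.0
right = torch.zeros(4, 4)
right[:, 2:] = 1.0
colors = initial_colors(image, torch.stack([left, right]))
print(weights[:, 0, 0], colors)
```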



In the figures above, you see an image that has been reconstructed with 500 rotated ellipses, and it looks more or less like a painting. The second image is the optimized image, and the third is the image after removing the transition effect at the edges. The opacity of the objects was set to 50%.
The animations below show the optimization process!


Below is another example, with 1000 objects at 80% opacity.





Fourier Shapes
So far we have only dealt with circles and ellipses. With a Fourier series, we can move on to more geometrical objects such as squares and triangles. By adding a sufficient number of Fourier series terms to the equation, I can approximate practically any closed shape. Look at a few examples below, where I optimize a single geometrical object from a circular starting shape to the desired shape.
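A minimal sketch of one way to do this, assuming the boundary radius is written as a Fourier series in the polar angle theta, r(theta) = r0 + sum_k (a_k cos(k theta) + b_k sin(k theta)). That parameterization and the names `fourier_radius` and `soft_fourier_shape` are assumptions on my part; under it, every ray from the center crosses the boundary exactly once.

```python
import torch

def fourier_radius(theta, r0, a, b):
    # Boundary radius as a truncated Fourier series in the polar angle.
    k = torch.arange(1, a.numel() + 1, dtype=theta.dtype)
    angles = theta[..., None] * k  # shape (..., K)
    return r0 + (a * torch.cos(angles) + b * torch.sin(angles)).sum(dim=-1)

def soft_fourier_shape(r0, a, b, size=128, sharpness=5.0):
    axis = torch.linspace(-0.5, 0.5, size)
    x, y = axis[None, :].expand(size, size), axis[:, None].expand(size, size)
    r = torch.sqrt(x**2 + y**2)
    theta = torch.atan2(y, x)
    # Keep the boundary radius positive so the division stays well defined.
    boundary = fourier_radius(theta, r0, a, b).clamp(min=1e-3)
    return torch.clamp(torch.tanh(sharpness * (1.0 - r / boundary)), min=0.0)

r0 = torch.tensor(0.2)
a = torch.tensor([0.0, 0.0, 0.05], requires_grad=True)  # cos(3 theta) term
b = torch.zeros(3, requires_grad=True)
mask = soft_fourier_shape(r0, a, b)

# The coefficients receive gradients just like a radius or a rotation would.
mask.sum().backward()
print(a.grad, b.grad)
```

With all coefficients at zero this is the plain soft circle, which matches the idea of starting from a circular shape and letting the optimizer bend it.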






The respective target images were as follows:






The first three images were comparatively simple to approximate, so only ten Fourier series terms were used. For the last three images, fifty terms were used, but for the last shape even fifty terms were not sufficient to approximate the bird well.
Now I can get the equation of the desired shape by saving the weights of the Fourier terms, and use that shape in place of the ellipses when constructing an image.
Below is an example where I optimize the image with random shapes; each shape is optimized with its own Fourier coefficients.



But we can also fix the Fourier coefficients to get a specific shape such as a triangle, a square or a heart. I will share the results with these specific shapes in the next post. Meanwhile, you can access the code here.
I have explained the algorithm in detail in this video.