
From Circle to ML via Batman: Part I

Beautiful Equations are much more than just Art!

"M", "L" and the Batman symbol are results of single Inequations each.
Batman Inequation I

Introduction:

The circle in itself is really pretty, ain’t it? But with some curiosity, you can go beyond a circle. It needs no calculus or advanced concepts: with just our favourite class 10 mathematics and a few beautiful ideas, we can go beyond the circle and even create the Batman Inequation. This article is about creating inequations for whatever shapes we want. In the second part, we will see how to improve and generalize this framework and derive functions commonly used in Machine Learning.

This is a journey from the curve on the left to the one on the right

All visualisations here were created with freely available tools such as Desmos and GeoGebra. I suggest readers play with these equations in those tools and explore the beauty for themselves.


The potential of a Circle!

The humble circle is the set of all points that lie at the same distance from a given centre. Let’s have a look at our familiar friend in all its forms. Take that distance to be 1 unit: the points at exactly this distance form the circle, points at a distance less than 1 lie inside it, and points at a distance greater than 1 lie outside it. In 3 dimensions, the paraboloid z=x²+y²-1 gives our circle where it intersects the XY plane.
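As a minimal sketch of the three cases above, the sign of z = x²+y²-1 tells us where a point sits relative to the unit circle. The function name `circle_side` and the `radius` parameter are illustrative choices, not something from the original figures.

```python
def circle_side(x, y, radius=1.0):
    """Classify a point by the sign of z = x^2 + y^2 - radius^2."""
    z = x**2 + y**2 - radius**2
    if z < 0:
        return "inside"   # below the XY plane on the paraboloid
    if z > 0:
        return "outside"  # above the XY plane on the paraboloid
    return "on"           # exactly on the circle


print(circle_side(0.5, 0.5))  # inside
print(circle_side(1.0, 0.0))  # on
print(circle_side(1.0, 1.0))  # outside
```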

Different views of a circle

We all know that a circle normally takes the form x²+y²=1. What do you think will happen if we go beyond the power of 2? That’s where the magic lies. Let’s take the values from 2 to 10 and observe what happens.

Powers going from 3 to 10

For even powers, we see the circle start to look like a square, and it becomes one in the limit of infinite power. But why does it do so, and why especially at even powers? We will first try to understand how the points lying on the curve (x^n+y^n=1) behave and then look into the nature and properties of points within the curve (x^n+y^n<=1).

Let’s understand odd powers first. If the power is odd and, say, x is negative, then x^n is also negative, so y^n must be greater than 1 for the sum to remain 1. Far from the origin, the 1 on the right-hand side is negligible next to the two terms, so x and y end up almost equal in magnitude. For example, if x⁵ is -100,000 then y⁵ must be 100,001 to satisfy the equation. But we are plotting x and y, not their fifth powers, so the point that satisfies the equation is (-10, 10.0000199..). This is extremely close to the line y=-x. The same holds for negative y and positive x. Also note that the higher the power, the smaller the deviation from y=-x.

If x and y are both negative, both terms are negative and the sum can never reach 1, and that’s why we don’t see any part of the curve in that quadrant.

When x and y are both positive, we see something like a part of a square. If x is noticeably smaller than 1, like 0.7, then x⁵ becomes small very fast (0.168 here). Thus y⁵ has to be 1-x⁵, which is 0.832, which means y is very close to 1 but slightly less (0.9638 here). The same logic applies the other way around. So for x values away from 1, y stays near 1 (like a horizontal edge), and for x values near 1, y drops to 0 fast (like a vertical edge). This makes the curve look like a part of a square. This can be seen below:
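The two worked numbers above can be checked with a few lines of Python. This is only a sketch: the helper names `real_root` and `y_on_curve` are made up for illustration, and the code assumes n is odd so that a real n-th root exists for negative values.

```python
import math


def real_root(value, n):
    """Real n-th root for odd n, valid for negative values too."""
    return math.copysign(abs(value) ** (1.0 / n), value)


def y_on_curve(x, n):
    """Solve x^n + y^n = 1 for y, assuming n is odd."""
    return real_root(1.0 - x**n, n)


y_far = y_on_curve(-10.0, 5)  # x^5 = -100000, so y^5 = 100001
y_near = y_on_curve(0.7, 5)   # x^5 = 0.16807, so y^5 = 0.83193

print(y_far)   # slightly above 10, hugging the line y = -x
print(y_near)  # close to 1, the flat "edge" of the square
```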

Demonstrating nature with n=9.

Once odd powers are understood, understanding even powers becomes much easier. With an even power, x^n and y^n are never negative, so the behaviour we saw in the positive quadrant repeats in all four quadrants and the entire curve looks like a square. This can be seen below. This is what we will focus on from here onwards.

Graph for n=10
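One quick way to watch the circle turn into a square is to follow the point where the curve crosses the diagonal x = y. There 2tⁿ = 1, so t = (1/2)^(1/n), which creeps towards the corner (1, 1) as n grows. The sketch below just tabulates that value; `diagonal_point` is an illustrative name.

```python
def diagonal_point(n):
    """Coordinate t where the diagonal x = y = t meets x^n + y^n = 1."""
    return 0.5 ** (1.0 / n)


# For n = 2 this is the familiar 1/sqrt(2) of the circle;
# for larger even n it moves towards the corner of the square.
for n in (2, 4, 10, 50, 1000):
    print(n, round(diagonal_point(n), 4))
```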

Finally, the circle and the square are not at odds with each other and have set their powers even. With the hope that you square up with these ideas and doubts don’t encircle you, we loop back to mathematics without cutting any corners.

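Wait!!! There is a lot left in the store of mathematics. We also know how to shift coordinates: replacing x with (x-a) moves the figure a units to the right, and replacing it with (x+a) moves it a units to the left (likewise for y, up and down). So we can not only generate a square, we can also position it anywhere we want. Not only that, we can also rescale each coordinate: multiplying a coordinate by a factor larger than 1 squeezes the figure along that axis, which turns the square into a rectangle. Let’s try it: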

A rectangle instead: (2(x-3))¹⁰+(3(y-2))¹⁰<=1
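To see what the shift and the scale factors do, here is a small sketch that evaluates the rectangle inequation directly. The parameter names (`cx`, `cy`, `sx`, `sy`) are illustrative; with the values from the caption the rectangle is centred at (3, 2) with half-widths 1/2 along x and 1/3 along y.

```python
def in_rectangle(x, y, cx=3.0, cy=2.0, sx=2.0, sy=3.0, power=10):
    """Evaluate (sx*(x-cx))^power + (sy*(y-cy))^power <= 1."""
    return (sx * (x - cx)) ** power + (sy * (y - cy)) ** power <= 1.0


print(in_rectangle(3.0, 2.0))  # centre of the rectangle
print(in_rectangle(3.4, 2.0))  # 0.4 right of centre, within the half-width 0.5
print(in_rectangle(3.6, 2.0))  # 0.6 right of centre, past the vertical wall
print(in_rectangle(3.0, 2.4))  # 0.4 above centre, past the half-height 1/3
```

The first two points come out inside and the last two outside, which matches a box that is wider than it is tall even though both terms use the same power.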

By now, those of you familiar with deeper mathematics may see how this relates to ideas like the Minkowski distance and p-norms, but we will leave them for the future.


Beyond Circles

While going beyond circles and rectangles, we have to take a slightly different perspective on these graphs. This time we look at the inequations, that is, at the points within the curve, for which the sum is less than 1. If the sum of two non-negative numbers is less than 1, then each of them has to be less than 1. Similarly, if there are many such terms, every one of them has to be small enough that the sum never exceeds 1; if even one of the terms is greater than 1, the inequation does not hold.

Isn’t this very similar to the idea of intersection? A point is selected only if it lies in all the sets (all terms give values close to 0); if it is missing from even one of the sets (one of the terms is greater than 1), it is not selected. Thus, the squares made above can be seen as the region of intersection of two terms, x^(2n)<1 and y^(2n)<1 (referred to as trenches for their shape), as shown below. Higher values of n push every term with absolute value below 1 closer to 0, so the sum approximates the intersection better.
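One way to make the intersection idea precise is the short derivation sketched here. The symbols t_1, ..., t_k (the expressions being raised to the power 2n) and k (how many of them there are) are notation introduced for this sketch only.

```latex
% The sum of even powers is squeezed between its largest term and k times that term:
\[
  \max_i |t_i|^{2n} \;\le\; \sum_{i=1}^{k} t_i^{2n} \;\le\; k \,\max_i |t_i|^{2n}
\]
% Taking 2n-th roots, and using k^{1/(2n)} \to 1:
\[
  \Bigl(\sum_{i=1}^{k} t_i^{2n}\Bigr)^{1/(2n)} \;\to\; \max_i |t_i|
  \qquad (n \to \infty)
\]
% So for large n the region where the sum is at most 1 approaches
\[
  \max_i |t_i| \le 1
  \quad\Longleftrightarrow\quad
  |t_1| \le 1 \ \text{and}\ \dots\ \text{and}\ |t_k| \le 1 ,
\]
% which is exactly the intersection of the individual trenches.
```

The first line also shows that for any finite n the summed region sits inside the true intersection, which is why the corners look slightly rounded.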

The squares(z = x¹⁰+y¹⁰) as the intersection of 2 trenches made by z=x¹⁰ and z=y¹⁰

And now we have taken a humongous step. We can make very complex figures which emerge from such intersections and take our designing skills to the next level.

A diamond as the intersection of diagonal trenches: z = (y-2x)¹⁰+(y+2x)¹⁰

We can get back our graph by just making z=1:

The same figure in 2D: (y-2x)¹⁰+(y+2x)¹⁰<=1. The higher the power, the better the approximation of the intersection.
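The claim about higher powers can be checked numerically. The sketch below compares the sum-of-powers test with the exact intersection of the two diagonal trenches, |y-2x| <= 1 and |y+2x| <= 1, on a grid over [-1, 1]². The function names and the grid size are arbitrary choices made for this illustration.

```python
def in_diamond_soft(x, y, power=10):
    """Sum-of-powers form: (y-2x)^power + (y+2x)^power <= 1."""
    return (y - 2 * x) ** power + (y + 2 * x) ** power <= 1.0


def in_diamond_exact(x, y):
    """Exact intersection of the trenches |y-2x| <= 1 and |y+2x| <= 1."""
    return abs(y - 2 * x) <= 1.0 and abs(y + 2 * x) <= 1.0


def agreement(power, steps=81):
    """Fraction of grid points in [-1, 1]^2 where both tests agree."""
    pts = [-1.0 + 2.0 * i / (steps - 1) for i in range(steps)]
    same = sum(
        in_diamond_soft(x, y, power) == in_diamond_exact(x, y)
        for x in pts
        for y in pts
    )
    return same / (steps * steps)


# The agreement should not decrease as the (even) power grows.
for power in (2, 10, 50):
    print(power, agreement(power))
```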

Let’s try this strategy on something simple:

This is a simple shape with 3 edges and an arc, so it is easy to make. Expression: ((y+x-1)/2)⁵⁰+((y-x-1)/2)⁵⁰+(y-0.5)⁵⁰+(x²+y²)⁵⁰<=1
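Reading the expression term by term: the first two terms keep |y+x-1| and |y-x-1| below 2 (two slanted edges), the third keeps |y-0.5| below 1 (a horizontal edge), and the last keeps x²+y² below 1 (the arc). A rough sketch for probing it point by point follows; `shape_value` and `in_shape` are illustrative names.

```python
def shape_value(x, y, power=50):
    """Left-hand side of the four-term inequation from the caption."""
    return (
        ((y + x - 1) / 2) ** power    # slanted trench around y + x = 1
        + ((y - x - 1) / 2) ** power  # slanted trench around y - x = 1
        + (y - 0.5) ** power          # horizontal trench around y = 0.5
        + (x**2 + y**2) ** power      # unit disc
    )


def in_shape(x, y):
    """True when the point satisfies the inequation."""
    return shape_value(x, y) <= 1.0


print(in_shape(0.0, 0.0))    # well inside every trench
print(in_shape(0.0, -0.8))   # below the horizontal wall at y = -0.5
print(in_shape(0.9, 0.9))    # outside the unit disc
print(in_shape(-0.9, -0.3))  # inside the disc but cut off by a slanted wall
```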

Remember that the trenches we make have their walls at the following positions: y - f(x) = 1 and y - f(x) = -1. This is because, when raised to a large even power, all values with absolute value below 1 tend to zero and thus are part of the trench, whereas all values with absolute value greater than 1 increase very fast, thus forming the walls. So we can use the trenches shown below.
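Written out for a single term on its own, the wall positions follow directly from taking the even root. The scale factor c in the second line is a hypothetical symbol added here to tie this back to the rescaling trick used for the rectangle.

```latex
% A single trench term with an even power 2n:
\[
  \bigl(y - f(x)\bigr)^{2n} \le 1
  \;\Longleftrightarrow\; |y - f(x)| \le 1
  \;\Longleftrightarrow\; f(x) - 1 \le y \le f(x) + 1
\]
% With a scale factor c > 0 the trench narrows to half-width 1/c:
\[
  \bigl(c\,(y - f(x))\bigr)^{2n} \le 1
  \;\Longleftrightarrow\; f(x) - \tfrac{1}{c} \le y \le f(x) + \tfrac{1}{c}
\]
```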

We are now fully equipped to make the Batman symbol, which covers more than half of our journey. The strategy is not just to intersect regions but also to eliminate regions from the curve to carve out the shape. This is done by taking the curves one by one and refining them and their positioning to match the shape. In some places, the curves had to be inverted, i.e. the region greater than 1 had to be made less than 1 and vice versa. This was done by changing the sign of the power of the curves. Note that this strategy can be applied to many shapes. All these curves have the property of being greater than 1 on one side (away from the symbol) and less than 1 on the other (towards the symbol). Thus every section has its own curve, and the curves are then combined using the sum of large even powers as described earlier.
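The inversion trick is easy to see on plain numbers: for a positive value g, raising it to a negative power swaps the "less than 1" and "greater than 1" sides. The sketch below uses arbitrary sample values and an arbitrary power of 10; it is not the exact power used for the Batman curves.

```python
def term(value, power):
    """One term of the sum: a curve value raised to a (possibly negative) power."""
    return value ** power


for g in (0.5, 2.0):
    # Positive power: small stays small, large blows up.
    # Negative power: the two sides trade places.
    print(g, term(g, 10), term(g, -10))
```

Note that a negative power is undefined where the curve value is exactly 0 (Python raises `ZeroDivisionError` there), so in practice the inverted term only makes sense away from those points.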

The following expressions were used (selected according to the shape of the curve):

  • f1(x,y) = (0.5(x-1.16)^(2.8))^(2) + (y+1.6): lower edge of right wing
  • f2(x,y) = (0.5(x+1.16)^(2.8))^(2) + (y+1.6): lower edge of left wing
  • f3(x,y) = (0.5(y+1.6))^(8) + (x+3): left edge of left wing
  • f4(x,y) = (0.5(y+1.6))^(8) + (-x+3): right edge of right wing
  • f5(x,y) = y+0.6: upper horizontal line
  • f6(x,y) = (3(x+0.45))^(14) - y + 1: left curve between head and wing
  • f7(x,y) = (3(x-0.45))^(14) - y + 1: right curve between head and wing
  • f8(x,y) = e^(3(y-0.1) - 258.18((1.9x+0.1)(1.9x-0.1))^(1.6)): forms the head and ears

When all these curves are combined the following figure is obtained:

Note that there are extra bits on the sides but the original function is intact.

To remove the extra bits, the function is cleaned by adding another term which gives values close to 0 near the shape we want and values greater than 1 in the places we don’t want. This gives the final figure:

The above figure is cleaned and has the following equation (the first term is extra). What does the first term do? It doesn’t kill our curve, so it simply makes it stronger (by eliminating unwanted parts).

We can identify all the parts of the curve individually as seen below:

All curves of the form f(x,y)=1 shown with the inequation

Conclusion and What’s coming next

We have just obtained a deep understanding of circles and similar inequations with high even powers. We understood why they behave like the intersection of inequations and put this into practice by creating our own Batman Inequation. Everything’s impossible until somebody does it. Well, the Batman equation was created about a decade ago, so we tried our hand at a Batman Inequation. But we still had to deal with large powers, had to trim our figure, and the process still seemed complex. Part II of this blog will remove all these challenges and simplify everything. It will also explain how these ideas are relevant in Machine Learning in the form of Softmax (our classification friend), Softplus (a well-known activation function), log-sum-exp (a commonly used function and the father of Softmax) and other related directions.

The math is the heaviest just before the application. And I promise you, the application is coming!


