Implementing Logic Gates using Neural Networks (Part 2)

AND, NAND, and XOR gates

Vedant Kumar
Towards Data Science

--

Hello everyone! Before starting with part 2 of implementing logic gates using neural networks, you may want to go through part 1 first.

From part 1, we know that we have two input neurons, or an x vector, with values x1 and x2, plus a bias input fixed at 1. The input values x1, x2, and 1 are multiplied by their respective weights W1, W2, and W0. The results are then fed to the summation neuron, which produces the summed value Z = W0 + W1*x1 + W2*x2, shown below.

Image by Author

Now, this value is fed to a neuron with a non-linear activation function (sigmoid in our case) that scales the output to a desirable range. The prediction ŷ is taken to be 0 if the sigmoid output is less than 0.5 and 1 if it is greater than 0.5 (equivalently, 0 if Z < 0 and 1 if Z > 0). Our main aim is to find the values of the weights, or the weight vector, that will enable the system to act as a particular gate.
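To make this concrete, here is a minimal sketch of such a neuron in Python (the sigmoid and neuron helper names are mine, not from the original articles):

import numpy as np

def sigmoid(z):
    # Squash any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x1, x2, w0, w1, w2):
    # Weighted sum of the inputs plus the bias term
    z = w0 + w1 * x1 + w2 * x2
    # Threshold the sigmoid output at 0.5 to get a binary prediction
    return 1 if sigmoid(z) > 0.5 else 0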

Implementing AND gate

The AND gate operation is simply multiplication of the inputs. If either input is 0, the output is 0; to get 1 as the output, both inputs must be 1. The truth table below conveys the same information.

Truth Table of AND gate and the values of weights that make the system act as AND and NAND gate, Image by Author

As we have 4 possible inputs, the weights must be chosen so that the AND gate condition is satisfied for all four input points.

(0,0) case

Consider the situation in which the input, or the x vector, is (0,0). The value of Z in that case is nothing but W0. Now, W0 has to be less than 0 so that the sigmoid output is less than 0.5, the output ŷ is 0, and the definition of the AND gate is satisfied. If W0 is above 0, then the value after Z has passed through the sigmoid function will be classified as 1, which violates the AND gate condition. Hence, we can say with certainty that W0 has to be negative. But what value of W0? Keep reading…

(0,1) case

Now, consider the situation in which the input, or the x vector, is (0,1). Here the value of Z will be W0 + 0 + W2*1. This, being the input to the sigmoid function, should be less than 0 so that the output is less than 0.5 and is classified as 0. Hence, W0 + W2 < 0. If we take W0 as -3 (remember, W0 has to be negative) and W2 as +2, the result is -3 + 2 = -1, which satisfies the above inequality and is consistent with the AND gate condition.

(1,0) case

Similarly, for the (1,0) case, the value of W0 will be -3 and that of W1 can be +2. Remember, you can take any values for the weights W0, W1, and W2 as long as the inequalities are preserved.

(1,1) case

In this case, the input, or the x vector, is (1,1). The value of Z is then nothing but W0 + W1 + W2. Now, Z has to be greater than 0 so that the output is 1 and the definition of the AND gate is satisfied. From the previous scenarios, we found the values of W0, W1, and W2 to be -3, 2, and 2 respectively. Placing these values in the Z equation yields -3 + 2 + 2 = 1, which is greater than 0 and will therefore be classified as 1 after passing through the sigmoid function.
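Reusing the neuron helper sketched earlier, a quick check confirms that the weights (-3, 2, 2) reproduce the AND truth table:

# All four inputs, with W0 = -3, W1 = 2, W2 = 2
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, '->', neuron(x1, x2, w0=-3, w1=2, w2=2))
# Prints: 0 0 -> 0, 0 1 -> 0, 1 0 -> 0, 1 1 -> 1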

A final note on AND and NAND implementation

The line separating the above four points is therefore given by the equation W0 + W1*x1 + W2*x2 = 0, where W0 is -3 and both W1 and W2 are +2. The equation of the separating line is therefore x1 + x2 = 3/2. The implementation of the NAND gate is similar, with just the weights flipped in sign: W0 equal to 3, and W1 and W2 equal to -2.
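The same check with the sign-flipped weights (3, -2, -2) reproduces the NAND truth table:

# All four inputs, with W0 = 3, W1 = -2, W2 = -2
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, '->', neuron(x1, x2, w0=3, w1=-2, w2=-2))
# Prints: 0 0 -> 1, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0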

Moving on to XOR gate

For the XOR gate, the truth table on the left side of the image below shows that the output is 1 only when the two inputs are complements of each other. If the inputs are the same ((0,0) or (1,1)), the output is 0. Plotting the points in the x-y plane on the right shows that they are not linearly separable, unlike in the case of the OR and AND gates (at least in two dimensions).

XOR gate truth table and plotting of values on the x-y plane, Image by Author

Two possible solutions

To solve the above separability problem, two techniques can be employed: adding non-linear features (also known as the kernel trick) or adding extra layers (making the network deep).

XOR(x1,x2) can be thought of as NOR(NOR(x1,x2),AND(x1,x2))

The solution to implementing XOR gate, Image by Author
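The decomposition is easy to verify with ordinary Boolean operations (no network involved yet):

def NOR(a, b):
    # 1 only when both inputs are 0
    return int(not (a or b))

def AND(a, b):
    # 1 only when both inputs are 1
    return int(a and b)

# XOR expressed as NOR(NOR(x1, x2), AND(x1, x2))
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, '->', NOR(NOR(x1, x2), AND(x1, x2)))
# Prints: 0, 1, 1, 0 — matching the XOR truth table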

Weights of the XOR network

Here we can see that the number of layers has increased from 2 to 3, as we have added a layer where the AND and NOR operations are computed. The inputs remain the same, with an additional bias input of 1. The table on the right below displays the output for each of the 4 inputs. An interesting thing to notice here is that the total number of weights has increased to 9: 6 weights from the input layer to the 2nd layer and 3 weights from the 2nd layer to the output layer. The 2nd layer is also termed the hidden layer.

Weights of the network for it to act as an XOR gate, Image by Author

Talking about the weights of the overall network: from the above and the part 1 content, we have deduced the weights for the system to act as an AND gate and as a NOR gate, and we will use those weights for the implementation of the XOR gate. For layer 1, 3 of the 6 weights are the same as those of the NOR gate and the remaining 3 are the same as those of the AND gate. Therefore, the weights for the input to the NOR neuron are [1, -2, -2], and those for the input to the AND neuron are [-3, 2, 2]. The weights from layer 2 to the final layer are the same as those of the NOR gate, namely [1, -2, -2].
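Putting everything together, here is a minimal sketch of the full two-layer network, assuming (as described above) that each neuron's sigmoid output is thresholded at 0.5 before being passed to the next layer:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def threshold(z):
    # Thresholding the sigmoid output at 0.5 is equivalent
    # to thresholding z at 0
    return 1 if sigmoid(z) > 0.5 else 0

def xor_network(x1, x2):
    # Hidden layer: a NOR neuron [1, -2, -2] and an AND neuron [-3, 2, 2]
    h_nor = threshold(1 - 2 * x1 - 2 * x2)
    h_and = threshold(-3 + 2 * x1 + 2 * x2)
    # Output layer: a NOR neuron [1, -2, -2] applied to the hidden outputs
    return threshold(1 - 2 * h_nor - 2 * h_and)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, '->', xor_network(x1, x2))
# Prints: 0, 1, 1, 0 — the XOR truth table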

Universal approximation theorem

It states that any continuous function can be approximated to any desired accuracy by a neural network with a single hidden layer, provided the hidden layer has enough neurons.
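In its commonly cited form (e.g., Cybenko, 1989, for sigmoid activations), the statement is roughly: for any continuous function f on [0,1]^n and any ε > 0, there exist a width N and parameters v_i, w_i, b_i such that

F(x) = \sum_{i=1}^{N} v_i \, \sigma\left( w_i^{\top} x + b_i \right)
\quad \text{satisfies} \quad
\sup_{x \in [0,1]^n} \left| F(x) - f(x) \right| < \varepsilon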

Geometrical interpretation

Linear separability of the two classes in 3D, Image by Author

With this, we can think of adding extra layers as adding extra dimensions. After visualizing in 3D, the X's and the O's now look separable. The red plane can now separate the two classes; such a plane is called a hyperplane. In conclusion, the above points are linearly separable in higher dimensions.
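One concrete way to see this with the XOR network above is to map each input to the pair of (thresholded) hidden activations. In that hidden space, the two classes become linearly separable:

# Map each input to the thresholded outputs of the two hidden neurons
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h_nor = int(1 - 2 * x1 - 2 * x2 > 0)
    h_and = int(-3 + 2 * x1 + 2 * x2 > 0)
    print((x1, x2), '->', (h_nor, h_and))
# (0, 0) -> (1, 0)   label 0
# (0, 1) -> (0, 0)   label 1
# (1, 0) -> (0, 0)   label 1
# (1, 1) -> (0, 1)   label 0
# The line h_nor + h_and = 1/2 separates the two classes in hidden space,
# which is exactly the boundary the output NOR neuron computes.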
