Power of a Single Neuron

Vaibhav Sahu
Towards Data Science
4 min read · Jun 29, 2018


Basic Unit of an Artificial Neural Network — Artificial Neuron

A Neural Network is a combination of basic neurons — also called perceptrons (a basic unit is shown in the diagram above, the green circle in the middle) — arranged in multiple layers as a network (diagram below). To understand the working and power of a large network, we first need to understand the working and power of a single unit. That is what we will focus on in this article!

A Network of Neurons

The best way to understand any algorithm is to try to code it yourself. If you can write simple code that creates clusters of data by recalculating the centroids in each iteration, you know k-means. If you can write simple code that builds multiple decision trees on subsamples of the data and takes a majority vote across the trees to classify a data point, you know random forest. Similarly, if you can write simple code that solves a single linear equation, two linear equations, and multiple linear equations using gradient descent, you understand neural networks and gradient descent.

Gradient Descent

The backbone of a neural network is Gradient Descent. To write code for Gradient Descent, we first need to understand it.

Let's say we have a simple linear equation to find solutions for:

w1·x1 + w2·x2 = y

Here we need to find values of w1 and w2 which make the equation true for the given values of y, x1 and x2. If we simply guess values for w1 and w2, we get an estimate ŷ (y_hat):

ŷ = w1·x1 + w2·x2

We can calculate the error we introduced by guessing the values of w1 and w2 as

C = ½ (y − ŷ)²

Cost Function

which we generally call the Cost. Now, our objective is to find values of w1 and w2 such that the Cost C is at its minimum. C is a differentiable function with respect to w1 and w2. From a calculus refresher: if we differentiate C with respect to w1 and w2 and set the derivatives equal to 0, we get the values of w1 and w2 where the cost reaches its minimum.

Gradients

∂C/∂w1 = −(y − ŷ)·x1
∂C/∂w2 = −(y − ŷ)·x2

These derivatives are called Gradients. The derivative of a function at a point is the slope of the tangent to the function at that point, i.e. its gradient. Looking at the equations above, we cannot solve directly for the values of w1 and w2 where the derivatives are 0, because each derivative depends on both w1 and w2 (through ŷ).

Now, to reach the minimum we start taking small steps towards it (which means moving opposite to the direction of the Gradient), updating the weights by a small amount each time:

w1 ← w1 − η·∂C/∂w1
w2 ← w2 − η·∂C/∂w2

where η is the learning rate. In short, we are descending in the direction opposite to the tangent, or Gradient — that is why the name “Gradient Descent”.

Gradient Descent
Pictorial Representation of Gradient Descent (Image Source: https://scipython.com/blog/visualizing-the-gradient-descent-method/)

Hand-coding a Single Neuron to Solve a Simple Linear Equation

Using numpy to implement Gradient Descent
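A minimal numpy sketch of such a function, following the derivation above (the name `single_neuron` and the default settings here are illustrative, not the article's exact code):

```python
import numpy as np

def single_neuron(x, y, lr=0.01, n_epochs=1000):
    x = np.asarray(x, dtype=float)
    w = np.zeros_like(x)              # initial guess for w1, w2
    for epoch in range(n_epochs):
        y_hat = np.dot(w, x)          # prediction with current weights
        error = y - y_hat
        grad = -error * x             # dC/dw for C = 1/2 * (y - y_hat)^2
        w = w - lr * grad             # step opposite to the gradient
    return w
```

Calling `single_neuron([3, 8], 41)` returns a pair (w1, w2) with 3·w1 + 8·w2 ≈ 41.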

Epochs are the iterations in which we take small steps towards the Cost minimum. The Learning Rate tells how small a step you want to take. Too large a step may never let you reach the minimum, while very small steps will take too much time to reach it.
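This trade-off can be seen with a small self-contained experiment on the same ½-squared-error cost (the helper name `run` is mine):

```python
def run(lr, n_epochs, x1=3.0, x2=8.0, y=41.0):
    """Run gradient descent and return the final absolute error."""
    w1 = w2 = 0.0
    for _ in range(n_epochs):
        error = y - (w1 * x1 + w2 * x2)
        w1 += lr * error * x1   # step opposite to dC/dw1 = -error * x1
        w2 += lr * error * x2
    return abs(y - (w1 * x1 + w2 * x2))

run(0.001, 10)   # step too small, too few epochs: error still large
run(0.01, 200)   # converges
run(0.1, 200)    # step too large: the error blows up instead of shrinking
```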

Testing the function with x1 = 3, x2 = 8 and y = 41

Results

One can argue that a single equation can have infinitely many solutions — the neuron will find the solution closest to the starting guess.
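This is easy to check: starting the same gradient-descent loop from two different initial guesses yields two different, equally valid solutions of 3·w1 + 8·w2 = 41 (a pure-Python sketch consistent with the derivation above; the name `solve_from` is mine):

```python
def solve_from(w1, w2, x1=3.0, x2=8.0, y=41.0, lr=0.01, n_epochs=500):
    # Gradient descent on C = 1/2 * (y - y_hat)^2, from a given start point.
    for _ in range(n_epochs):
        error = y - (w1 * x1 + w2 * x2)
        w1 += lr * error * x1
        w2 += lr * error * x2
    return w1, w2

wa = solve_from(0.0, 0.0)    # one solution of 3*w1 + 8*w2 = 41
wb = solve_from(10.0, 0.0)   # a different, equally valid solution
```

Both results satisfy the equation, but they are different points on the solution line — each is the one nearest its starting guess.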

Single Neuron to Solve Two Linear Equations

The same function can be modified to solve two equations. This time the cost function will be the sum of the squared errors of both equations:

C = ½ [(y₁ − ŷ₁)² + (y₂ − ŷ₂)²]

where the subscripts index the two equations.

Cost Vs Epochs
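One way to modify the function is to vectorize it with numpy so that each row of X is one equation (the name `solve_system` is mine, not the article's):

```python
import numpy as np

def solve_system(X, y, lr=0.01, n_epochs=5000):
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    for epoch in range(n_epochs):
        error = y - X @ w          # one residual per equation
        w += lr * (X.T @ error)    # gradient of 1/2 * sum(error**2) is -X.T @ error
    return w

# Two equations: 3*w1 + 8*w2 = 41 and 2*w1 + 1*w2 = 10
w = solve_system([[3, 8], [2, 1]], [41, 10])
```

With two independent equations in two unknowns the solution is unique, so here gradient descent converges to (w1, w2) = (3, 4) regardless of the start point.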

Single Neuron to Solve Multiple Linear Equations

The same function can be modified to solve any number of equations. This time the cost function will be a sum over all the equations:

C = ½ Σᵢ (yᵢ − ŷᵢ)²

Cost function for Multiple Equations
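The same vectorized update carries over unchanged: only the number of rows in X grows. A sketch with an illustrative five-equation, two-unknown system (the setup and names here are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))       # five equations, two unknowns
y = X @ np.array([3.0, 4.0])          # built so the system has solution (3, 4)

w = np.zeros(2)
lr, n_epochs = 0.05, 5000
for epoch in range(n_epochs):
    error = y - X @ w                 # one residual per equation
    w += lr * (X.T @ error)           # same gradient step as before
```

When the system has no exact solution, this same loop converges to the least-squares fit, since it is minimizing the summed squared error.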

If the error increases with each epoch, reduce the learning rate. If the error decreases but does not get below your threshold, increase the number of epochs (n_epochs).

This article explained the Gradient Descent algorithm and showed how a basic neuron can be created with numpy to solve linear equations.

All the above codes can be found at my git repo.

Comments and feedback are most welcome!


Strategic Cloud Engineer | Google Certified Professional Machine Learning Engineer | Interested in Artificial Intelligence, Deep Learning