Linear Regression (Part 2): Implementation in Python, an example from scratch

Chethan Kumar GN
Towards Data Science
5 min read · Sep 17, 2018


Machine learning implementation example in 5 minutes: implement a simple linear regression model in Python (Part 2).

Linear Regression

Prerequisite: Linear Regression (Part 1)

Since the theory was covered in the previous post, let us kick off the actual implementation. Many datasets are available, but here we will use our own. A simple linear regression has one dependent variable (Y) and one independent variable (X).

Step-1: Prepare the datasets

dataset to be used
  • X — House size from 1K sq feet to 10K sq feet.
  • Y — Cost of the house from 300K to 1200K.

Our dataset is now ready. Since this is a simple linear regression, only one factor, the size of the house, affects its price. In multiple linear regression, more factors would affect the house price, such as locality and the number of rooms (this will be implemented in the next part of the series, Part 3).

Plotted data on a graph

Now we need to find the regression line: the line that best fits the scatter plot above, so that we can predict the response y (i.e. the cost of the house) for any new value of x (i.e. the size of the house).
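In symbols, the regression line and the least-squares criterion that defines "best fit" can be written as below. The names b_0 (intercept) and b_1 (slope) are notation assumed here for illustration; Part 1 may use different symbols.

```latex
\hat{y}_i = b_0 + b_1 x_i,
\qquad
(b_0, b_1) = \arg\min_{b_0,\, b_1} \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right)^2
```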

Step-2: Let’s start coding!!!

Import the required libraries

  1. NumPy: NumPy is a library for the Python programming language. We use it in machine learning because we have to deal with large amounts of data, and NumPy arrays are faster than plain Python lists. (Get used to NumPy arrays; we will use them everywhere.) For installation instructions click here.
  2. Matplotlib: Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. In short, it helps us plot graphs.
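The two imports for this step can be written as follows. This is a minimal sketch; the final print line is an optional sanity check added for illustration and is not part of the steps described here.

```python
import numpy as np
import matplotlib.pyplot as plt

# Optional sanity check (added for illustration): confirm both short names work
print("NumPy", np.__version__, "imported; pyplot available:", callable(plt.plot))
```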

Here we import “numpy as np” and “matplotlib.pyplot as plt” to shorten the long names, for ease of coding:
  1. numpy.array() becomes np.array()
  2. matplotlib.pyplot.plot(x, y) becomes plt.plot(x, y)

How do you know whether the libraries are installed correctly? Run a file containing just the two import statements; if no errors appear, you are good to go.

Define the functions required

Function1: A function that estimates the regression coefficients; the x and y values are passed into it.

Steps include:

  1. Calculate n, the number of observations.
  2. Calculate the mean of both the x and y NumPy arrays.
  3. Calculate the cross-deviation and deviation: here we compute SS_xy, the sum of cross-deviations of x and y, and SS_xx, the sum of squared deviations of x, as explained in the previous post, Linear Regression (Part 1).
  4. Calculate the regression coefficients: the slope b_1 = SS_xy / SS_xx and the intercept b_0 = mean(y) - b_1 * mean(x), which together define the regression line.
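The four steps above can be sketched as follows. The function name, the variable names and the sample data are assumptions chosen for illustration. The data values only fall within the ranges described in Step-1 and are not taken from the article, so the fitted numbers may differ from the figures quoted later.

```python
import numpy as np


def estimate_coefficients(x, y):
    """Return (b_0, b_1) for the least-squares line y = b_0 + b_1 * x."""
    # Step 1: number of observations
    n = np.size(x)

    # Step 2: mean of x and y
    mean_x, mean_y = np.mean(x), np.mean(y)

    # Step 3: cross-deviation of x and y, and squared deviation of x
    ss_xy = np.sum(y * x) - n * mean_y * mean_x
    ss_xx = np.sum(x * x) - n * mean_x * mean_x

    # Step 4: slope (b_1) and intercept (b_0)
    b_1 = ss_xy / ss_xx
    b_0 = mean_y - b_1 * mean_x
    return b_0, b_1


# Illustrative data only (hypothetical values, not the article's dataset):
# x is house size in thousands of sq feet, y is cost in thousands
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([300, 350, 500, 700, 800, 850, 900, 900, 1000, 1200])

b_0, b_1 = estimate_coefficients(x, y)
print(f"intercept b_0 = {b_0:.2f}, slope b_1 = {b_1:.2f}")
```

As a cross-check, np.polyfit(x, y, 1) should return the same slope and intercept, up to floating-point rounding.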

Function2: It is a function to plot the graph based on calculated values.

Steps include:

  1. Plot the points: “plt.scatter” plots the points on the graph, where
  • “x” and “y” are the locations of the points on the graph
  • “color” is the colour of the plotted points; change it to red, green, or orange and play around. For more possible colours click here.
  • “marker” is the shape of the points, such as a circle or another symbol. For the different types of marker, find them here.

2. Predict the regression line value: compute the predicted y for each x from the estimated coefficients. These predictions define the fitted line, which is the line with the minimum squared error.

3. Plot the regression line

4. Add labels: give the x and y axes descriptive names on the graph instead of plain x and y.

5. Show the plotted graph
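A minimal sketch of the plotting function described in these five steps is shown below. The function name, the colours ("m" for magenta, "g" for green), the marker size and the axis label text are all assumptions made for illustration. The argument b is assumed to hold the intercept and slope as (b_0, b_1).

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_regression_line(x, y, b):
    """Scatter the data and draw the line y = b[0] + b[1] * x."""
    # Step 1: plot the actual points
    plt.scatter(x, y, color="m", marker="o", s=30)

    # Step 2: predicted response for every x
    y_pred = b[0] + b[1] * x

    # Step 3: plot the regression line
    plt.plot(x, y_pred, color="g")

    # Step 4: axis labels (label text is an assumption)
    plt.xlabel("Size (thousand sq feet)")
    plt.ylabel("Cost (thousand)")

    # Step 5: show the plotted graph
    plt.show()
```

Calling plot_regression_line(x, y, (b_0, b_1)) with NumPy arrays x and y opens a window showing the scatter plot with the fitted line drawn over it.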

Function3: Main function

Steps include:

  1. Gather the required datasets, i.e. x and y.
  2. Calculate the required coefficients, i.e. the intercept and slope of the regression line.
  3. Plot the graph

Lastly, write the main function and call it:
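Here is a sketch of how the main function and its call might look. So that the script runs on its own, it includes condensed versions of the two helper functions. All names, the sample data and the plot styling are assumptions for illustration. The data values are hypothetical and not the article's dataset. Returning the coefficients from main is a small addition that makes them easy to inspect.

```python
import numpy as np
import matplotlib.pyplot as plt


def estimate_coefficients(x, y):
    # Condensed Function 1: deviation form, equivalent to SS_xy / SS_xx
    mean_x, mean_y = np.mean(x), np.mean(y)
    b_1 = np.sum((x - mean_x) * (y - mean_y)) / np.sum((x - mean_x) ** 2)
    return mean_y - b_1 * mean_x, b_1


def plot_regression_line(x, y, b):
    # Condensed Function 2: points, fitted line, labels, then show
    plt.scatter(x, y, color="m", marker="o", s=30)
    plt.plot(x, b[0] + b[1] * x, color="g")
    plt.xlabel("Size (thousand sq feet)")
    plt.ylabel("Cost (thousand)")
    plt.show()


def main():
    # Step 1: gather the data (hypothetical values for illustration)
    x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    y = np.array([300, 350, 500, 700, 800, 850, 900, 900, 1000, 1200])

    # Step 2: calculate the coefficients (intercept b[0], slope b[1])
    b = estimate_coefficients(x, y)
    print(f"Estimated coefficients: b_0 = {b[0]:.2f}, b_1 = {b[1]:.2f}")

    # Step 3: plot the graph
    plot_regression_line(x, y, b)
    return b


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard means main() runs only when the file is executed directly, not when it is imported from another module.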

The final result:

So if asked for the price of a house of size 7K sq feet, the answer would be around 920K, whereas the real value is around 900K, so the error is about 20K.

Find the code on GitHub; the link is below:
https://github.com/chethangn/SimpleLinearRegression

More references:

  1. Is Artificial Intelligence real or is it just a hype of this decade??
  2. Artificial Intelligence: Definition, Types, Examples, Technologies
  3. Artificial Intelligence vs Machine Learning
  4. Why Machine learning for achieving Artificial Intelligence? The Need for Machine Learning
  5. Machine Learning Types and Algorithms
  6. Linear Regression (Part 1)

Next up is Linear Regression (Part 3), where we implement multiple linear regression.

Follow me on Medium, LinkedIn, Twitter, and Instagram for more updates. If you liked this article, give it a clap and share it.
