An Introduction to K-Nearest Neighbors Algorithm

What is KNN?

Indhumathy Chelliah
Towards Data Science
8 min read · Nov 23, 2020


Photo by Asad Photo Maldives from Pexels

KNN

The K-Nearest Neighbors (KNN) algorithm is one of the simplest supervised machine learning algorithms, and it can be used to solve both classification and regression problems.
KNN is also known as an instance-based model or a lazy learner because it doesn’t construct an internal model.

For classification problems, it will find the k nearest neighbors and predict the class by the majority vote of the nearest neighbors.

For regression problems, it will find the k nearest neighbors and predict the value by calculating the mean value of the nearest neighbors.
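The two prediction rules above (majority vote for classification, mean for regression) can be sketched with scikit-learn. This is a minimal sketch, and the one-feature toy data is invented purely for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# Toy training data (made up for illustration): one feature per sample.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_class = np.array([0, 0, 0, 1, 1, 1])                  # class labels
y_value = np.array([1.2, 1.9, 3.1, 9.8, 11.2, 12.1])   # continuous targets

# Classification: majority vote among the k=3 nearest neighbors.
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
print(clf.predict([[2.5]]))   # neighbors 1.0, 2.0, 3.0 -> class 0

# Regression: mean of the k=3 nearest neighbors' target values.
reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_value)
print(reg.predict([[11.0]]))  # mean of 9.8, 11.2, 12.1 -> ~11.03
```

Both estimators share the same neighbor search; only the final aggregation step (vote vs. mean) differs.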

The main idea behind KNN is the old saying:

Birds of a feather flock together.

Topics covered in this story

  1. KNN Classification
  2. How to find the optimum k value?
  3. How to find the k nearest neighbors?
  4. Euclidean distance
  5. Why KNN is known as an instance-based method or Lazy learner
  6. Example dataset, dataset representation
  7. Calculating Euclidean distance
  8. Finding nearest neighbors
  9. Python Implementation of KNN Using sklearn
  10. How to find the best k value? Plot error vs k values.

KNN Classification

Let’s learn how to classify data using the KNN algorithm. Suppose we have two classes: circle and triangle.

Below is the representation of data points in the feature space.

Data points representation [Image by Author]

Now we have to predict the class of a new query point (the star shape shown in the figure): does it belong to the circle class or the triangle class?
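This two-class setup can be sketched in scikit-learn. The 2-D coordinates for the circles, triangles, and the star query point below are invented for illustration, not taken from the figure:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Invented 2-D feature-space coordinates for the two classes.
circles = [[1, 1], [2, 1], [1, 2], [2, 2]]      # class "circle"
triangles = [[6, 5], [7, 5], [6, 6], [7, 6]]    # class "triangle"

X = np.array(circles + triangles)
y = np.array(["circle"] * 4 + ["triangle"] * 4)

knn = KNeighborsClassifier(n_neighbors=3)  # k = 3
knn.fit(X, y)

# The new query point (the "star" in the figure): its class is decided
# by a majority vote among its 3 nearest neighbors.
star = [[2.5, 2.0]]
print(knn.predict(star))     # -> ['circle']
print(knn.kneighbors(star))  # distances and indices of the 3 nearest neighbors
```

Since all three of the star's nearest neighbors are circles, the majority vote assigns it to the circle class.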
