An Introduction to K-Nearest Neighbors Algorithm
What is KNN?
The K-Nearest Neighbors (KNN) algorithm is one of the simplest supervised machine learning algorithms and can be used to solve both classification and regression problems.
KNN is also known as an instance-based model or a lazy learner because it doesn't construct an internal model; instead, it stores the training instances and defers all computation until prediction time.
For classification problems, it will find the k nearest neighbors and predict the class by the majority vote of the nearest neighbors.
For regression problems, it will find the k nearest neighbors and predict the value by calculating the mean value of the nearest neighbors.
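The two prediction rules above can be sketched in a few lines of plain Python. This is an illustrative implementation, not a library API; the function name `knn_predict` and the toy data are assumptions for the example.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3, task="classification"):
    """Predict the label (majority vote) or value (mean) of `query`
    from its k nearest training points."""
    # Euclidean distance from the query point to every training point
    distances = np.linalg.norm(np.asarray(X_train, dtype=float) - np.asarray(query, dtype=float), axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    neighbour_labels = [y_train[i] for i in nearest]
    if task == "classification":
        # Majority vote among the k neighbours
        return Counter(neighbour_labels).most_common(1)[0][0]
    # Regression: mean of the neighbours' values
    return float(np.mean(neighbour_labels))

# Two well-separated toy clusters
X = [[1, 1], [1, 2], [2, 2], [8, 8], [8, 9], [9, 9]]
y_cls = ["circle", "circle", "circle", "triangle", "triangle", "triangle"]
y_reg = [1, 2, 3, 10, 11, 12]

print(knn_predict(X, y_cls, [1.5, 1.5], k=3))                      # → circle
print(knn_predict(X, y_reg, [8.5, 8.5], k=3, task="regression"))   # → 11.0
```

Note that the same neighbor search drives both tasks; only the aggregation step (vote vs. mean) differs.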
The main idea behind KNN is captured by the saying:
"Birds of a feather flock together."
Topics covered in this story
- KNN Classification
- How to find the optimum k value?
- How to find the k nearest neighbors?
- Euclidean distance
- Why KNN is known as an instance-based method or Lazy learner
- Example dataset, dataset representation
- Calculating Euclidean distance
- Finding nearest neighbors
- Python Implementation of KNN Using sklearn
- How to find the best k value? Plot error vs k values.
KNN Classification
Let’s learn how to classify data using the KNN algorithm. Suppose we have two classes: circle and triangle.
Below is the representation of data points in the feature space.
Now we have to predict the class of a new query point (the star shape in the figure): does it belong to the circle class or the triangle class?
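Scenarios like this can be reproduced with scikit-learn's `KNeighborsClassifier`. The coordinates below are made up to mimic the figure: two small clusters standing in for the circle and triangle classes, and a query point near the circles.

```python
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data: two clusters standing in for the circle and triangle classes
X = [[1, 1], [1, 2], [2, 1], [6, 6], [6, 7], [7, 6]]
y = ["circle"] * 3 + ["triangle"] * 3

# k = 3: each prediction is a majority vote among the 3 nearest neighbours
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

# The "star" query point whose class we want to predict
print(model.predict([[2, 2]]))  # → ['circle']
```

Because all three nearest neighbors of (2, 2) are circles, the vote is unanimous and the model predicts the circle class.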