
Deep Dive Analysis of Missing Values in Dataset

What are the types of Missing Data with Examples?

Satyam Kumar
Towards Data Science
4 min read · Jan 13, 2021
Image by Gino Crescoli from Pixabay

Real-world datasets often contain many missing values. They can arise from loss of information, disagreement in uploading the data, and many other causes. Missing values need to be imputed before proceeding to the next step of the model development pipeline. Before imputing them, it's important to understand the type of missing value present in the dataset.
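Before deciding how to impute, it helps to see how much is missing and where. The snippet below is a minimal sketch using pandas; the DataFrame and its column names (age, income, city) are made up for illustration and are not taken from any dataset in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the columns and values are illustrative only.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 47, np.nan],
    "income": [52000, 61000, np.nan, 83000, 45000],
    "city": ["Pune", None, "Delhi", "Mumbai", "Delhi"],
})

# isna() flags both NaN and None as missing.
missing_count = df.isna().sum()   # number of missing entries per column
missing_share = df.isna().mean()  # fraction of missing entries per column

summary = pd.DataFrame({"missing": missing_count, "share": missing_share})
print(summary)
```

A per-column summary like this only shows how much is missing. It does not tell you why the values are missing, which is what the categories of missing data describe.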

Why missing data is a problem

Missing values can hurt model performance by introducing bias into the dataset. This bias makes the dataset less reliable and less trustworthy. The missing values might also have carried crucial insights or information for model development.
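To see how missing values can bias a dataset, consider a small sketch in which the values go missing for a reason rather than by chance. The numbers are invented, and the assumption that the highest earners withhold their income is purely illustrative.

```python
import pandas as pd

# Hypothetical incomes (in thousands) for ten people.
income = pd.Series([20, 30, 40, 50, 60, 70, 80, 90, 100, 110], dtype=float)

# Assumption for illustration: everyone earning above 70 declines to report,
# so those entries become NaN in the observed data.
observed = income.where(income <= 70)

true_mean = income.mean()        # mean over all ten values
observed_mean = observed.mean()  # pandas skips NaN by default

print(f"true mean: {true_mean}, observed mean: {observed_mean}")
print(f"missing entries: {observed.isna().sum()}")
```

Here the mean of the observed values (45) underestimates the true mean (65) because the missingness depends on the value itself. Simply dropping the missing rows, or filling them with the observed mean, would carry that bias into the model.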

Values can be missing intentionally, at random, or for a specific reason. Missing data is therefore a problem that needs to be handled before moving on to the next stage of the model development pipeline.

Missing data can be broadly divided into 3 categories. The category a feature falls into needs to be identified first, because the choice of imputation strategy depends on it:

  • Missing Completely At
