Deep Dive Analysis of Missing Values in Dataset
What are the types of Missing Data with Examples?
Real-world datasets often contain many missing values. Values can go missing because information was lost, because of inconsistencies while uploading the data, and for many other reasons. Missing values need to be imputed before proceeding to the next step of the model development pipeline. Before imputing them, it's important to understand which type of missing data is present in the dataset.
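Before choosing an imputation strategy, a quick first step is to check how much is missing in each column. Here is a minimal sketch with pandas, assuming a hypothetical toy DataFrame (the column names and values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; columns and values are illustrative only.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 47, np.nan],
    "income": [52000, 61000, np.nan, 58000, 49000],
    "city": ["Pune", "Delhi", None, "Mumbai", "Delhi"],
})

# isna() marks both NaN and None as missing.
missing_count = df.isna().sum()    # number of missing cells per column
missing_share = df.isna().mean()   # fraction of missing cells per column

summary = pd.DataFrame({"missing": missing_count, "share": missing_share})
print(summary)
```

Columns with a high share of missing values usually deserve a closer look before any imputation is applied.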
Why missing data is a problem
Missing values can hurt model performance by introducing bias into the dataset. That bias makes the dataset less reliable and less trustworthy. The values that were lost might also have carried crucial insights or information for model development.
Values can be missing intentionally, at random, or for a specific reason. Missing data is therefore a problem that needs to be handled before proceeding to the next stage of the model development pipeline.
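To make the bias concrete, here is a small sketch with made-up numbers. It assumes a hypothetical feature where the largest values are exactly the ones that went missing (values missing for a reason), and compares the mean of the complete data with the mean of what was actually observed:

```python
from statistics import mean

# Hypothetical "true" values of a feature, e.g. income in thousands.
true_values = [20, 30, 40, 50, 60, 70, 80, 90]

# Assumption for illustration: every value above 60 was never recorded.
observed = [v if v <= 60 else None for v in true_values]

# Dropping the missing entries is the naive way to handle them.
observed_only = [v for v in observed if v is not None]

full_mean = mean(true_values)        # mean of the complete data
observed_mean = mean(observed_only)  # mean after dropping missing entries

print(f"mean of complete data: {full_mean}")      # 55
print(f"mean of observed data: {observed_mean}")  # 40
```

Because the missing entries are not a random subset, the observed mean underestimates the true mean, and a model trained on the observed data alone inherits that bias.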
Missing data can be broadly divided into 3 categories. You need to work out which category applies, because the right way to impute a feature's missing values depends on the category its missing data falls into:
- Missing Completely At…