How to Create a Representative Test Set

Assessing your model’s performance with confidence, using representative test data sets.

Dimitris Poulopoulos
Towards Data Science
6 min readMay 7, 2020

--

Image by andrasgs from Pixabay

Splitting a data set into train and test sets is usually one of the first processing steps in a machine learning pipeline. After this point, there is just one inviolable rule: set aside the test data split and refer to it again only when your model…

--

--

Machine Learning Engineer. I talk about AI, MLOps, and Python programming. More about me: www.dimpo.me