Careful! Looking at your model results too much can cause information leakage

Paul May
Towards Data Science
6 min read · May 2, 2019

It’s always tempting to use as much data as you can when building machine learning models. I think we are all aware of the issue of overfitting, where the model you build fits the training data so closely that it reproduces the training results almost perfectly but fails to generalise to the population the data comes from. The results can be catastrophic: feed in new data and you get very odd predictions.
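
To make the overfitting idea concrete, here is a minimal sketch using only NumPy. The sine-shaped data, the noise level, the sample sizes and the two polynomial degrees are all assumptions chosen for illustration; none of them come from a real project. It fits a simple and a very flexible polynomial to the same small training set and compares the error on the training points with the error on fresh points from the same population.

```python
import numpy as np

# Illustrative assumption: the "population" is a noisy sine curve.
rng = np.random.default_rng(0)

def make_data(n):
    """Draw n noisy samples from the assumed population."""
    x = rng.uniform(-1, 1, n)
    y = np.sin(np.pi * x) + rng.normal(0, 0.3, n)
    return x, y

x_train, y_train = make_data(15)    # small training set
x_test, y_test = make_data(200)     # fresh data the model has never seen

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (3, 12):
    # Least-squares polynomial fit on the training data only.
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train),
                       mse(coeffs, x_test, y_test))
    print(f"degree {degree:2d}: train MSE = {results[degree][0]:.4f}, "
          f"test MSE = {results[degree][1]:.4f}")
```

The degree-12 polynomial has almost as many parameters as there are training points, so it can chase the noise and drive the training error close to zero. The number to watch is the gap between its training error and its error on the unseen points: a large gap is the signature of a model that has memorised its training data rather than learned the underlying pattern.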

Data Scientist, Astrophysics PhD, reliability engineer and part-time writer. I love exploring the world of science and how it shapes the world we live in.