Model Drift in Machine Learning

Understanding and Dealing with Model Drift

Kurtis Pykes
Towards Data Science
5 min read · Aug 3, 2021


All things tend towards disorder. The second law of thermodynamics states “as one goes forward in time, the net entropy (degree of disorder) of any isolated or closed system will always increase (or at least stay the same)”. Thus, nothing lasts forever. Our youth is not forever, the best becomes the worst, and our machine learning models degrade as time does its thing.

The world is not static, it’s dynamic and continually changing. A spam email from the 2000s isn’t the same as a spam email in 2021. The features used to detect fraudulent emails in 2021 would differ significantly from those of the 2000s — people got smarter, including scammers. If we used a model trained in the year 2000 to classify emails from 2021 as fraudulent or not, we could expect its predictive power to be far worse than it was on emails from 2000. This phenomenon is known as model drift.

Model Drift is the decay of a model's predictive power as a result of alterations in the environment. If things stayed the same, i.e. the environment and the data, we should expect our machine learning model's predictive power to remain constant. However, we all know the real world is ever changing. The changes in a real world…
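One common way to catch drift like this in practice is to compare the distribution of a feature in production against the distribution the model was trained on. Below is a minimal sketch of that idea using a two-sample Kolmogorov–Smirnov statistic computed with NumPy; the data, feature names, and alert threshold are all hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical feature values: the data the model was trained on,
# and newer "live" data whose distribution has shifted.
train_feature = rng.normal(loc=0.0, scale=1.0, size=1000)
live_feature = rng.normal(loc=0.5, scale=1.2, size=1000)

def ks_statistic(a: np.ndarray, b: np.ndarray) -> float:
    """Two-sample KS statistic: the largest gap between the
    empirical CDFs of the two samples."""
    values = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), values, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), values, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

stat = ks_statistic(train_feature, live_feature)

# Approximate critical value at the 5% level for two samples of
# sizes n and m: 1.358 * sqrt((n + m) / (n * m)).
n, m = len(train_feature), len(live_feature)
threshold = 1.358 * np.sqrt((n + m) / (n * m))

drift_detected = stat > threshold
print(f"KS statistic={stat:.3f}, threshold={threshold:.3f}, drift={drift_detected}")
```

In a real pipeline you would run a check like this per feature on a schedule, and treat a flagged feature as a prompt to investigate and possibly retrain, rather than as proof of degraded accuracy on its own.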
