[Image: somebody enjoying fresh air]

Keeping your production model fresh

Robert de Graaf
Towards Data Science
4 min read · Mar 21, 2018


A great model needs love and attention if it is to stay as useful throughout its life as it was on day one.

Any statistical or machine learning model will lose performance over time as the relationships it captures alter. Sometimes this happens very suddenly, as it did to many credit default models during the global financial crisis (GFC). Other times the degradation takes place over a longer period, and can almost be predicted by someone watching the trend.

What drives the degradation? First of all, no matter how careful you were, to some degree your model fit on noise or on latent factors, which is to say it was wrong to begin with, and some of your accuracy was due to random chance.

The next problem is that things actually do change. Think of a car insurance model, which has variables relating to vehicle specifications. Some of those variables will be effective risk indicators because they are correlated with the buying choices of risky drivers, but as buying preferences change over time, for example favouring more or less fuel-efficient cars as fuel prices change, the correlation will change as well. Similarly, a credit default model trained during good economic conditions can easily miss the factors that increase the chance of default during poor conditions.

Intuitively, a point that emerges from those two examples is that models which depend on human behaviour may be especially susceptible to degradation, whereas models that relate more closely to physical processes may have some additional stability. It follows that a key ally in understanding how much of a risk this is for your model, and over what time frame, will be your subject matter expert. In most cases the result will be a regular schedule of model review and retraining.

At the same time, you will likely want to use what your data is telling you, so you’ll need methods to determine whether the newly arrived input data has changed. This is particularly the case for circumstances which change quickly.

In the case of input variables where the data points have a high degree of independence, control charts, as used in Statistical Process Control, could be used to detect changes to the process.

There are many guides to the use of these charts both in print and online, and they have been successfully used for many years. Their common element is that measurements from a process are plotted sequentially on a chart with a centreline at the mean (or other appropriate process average) and upper and lower lines to represent the usual process range. Accordingly, it is easy to establish when the process has changed either its range or its average result.
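As a minimal sketch of the idea, the snippet below builds an individuals (I) chart in plain Python: sigma is estimated from the average moving range divided by the standard Shewhart constant 1.128, and new daily values of a model input are flagged when they fall outside the three-sigma limits. The variable names and the sample figures are invented for illustration.

```python
from statistics import mean

def control_limits(baseline):
    """Centreline and 3-sigma limits for an individuals (I) chart.

    Sigma is estimated from the average moving range divided by the
    d2 constant 1.128 (subgroups of size 2), the standard Shewhart
    approach for individual measurements.
    """
    centre = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma = mean(moving_ranges) / 1.128
    return centre - 3 * sigma, centre, centre + 3 * sigma

def out_of_control(values, lcl, ucl):
    """Return the (index, value) pairs falling outside the limits."""
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]

# Baseline week of an input variable's daily mean, then new arrivals.
baseline = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0]
lcl, centre, ucl = control_limits(baseline)
alerts = out_of_control([10.1, 9.7, 12.5], lcl, ucl)  # flags the 12.5
```

A real monitoring job would recompute the limits only on an agreed baseline period and raise an alert whenever `alerts` is non-empty.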

However, especially for attribute or categorical data, methods developed for use on relatively small data can give problematic results when used on much larger quantities of data.

The qicharts package in R has implemented one solution to this problem — the ‘prime’ charts developed by David Laney, which continue to give accurate results when large subgroups are used. This package contains a full range of quality control charts, so you will be able to find one that fits your needs.
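For readers who want to see what Laney's correction actually does, here is a sketch of the p′ chart calculation in Python (the qicharts package in R does this for you; this is only an illustration of the method, with invented subgroup data). The ordinary p-chart limits shrink as subgroup size grows, so on large subgroups they flag trivially small differences; Laney's fix inflates the limits by sigma-z, the short-term variation of the z-scored proportions.

```python
from math import sqrt

def laney_p_prime_limits(counts, sizes):
    """Laney p-prime chart limits for proportion data.

    For each subgroup i: z_i = (p_i - p_bar) / sd_i, where
    sd_i = sqrt(p_bar * (1 - p_bar) / n_i). Sigma-z is estimated
    from the average moving range of the z-scores (divided by
    1.128), and the usual 3-sigma limits are widened by that factor.
    """
    p_bar = sum(counts) / sum(sizes)
    props = [c / n for c, n in zip(counts, sizes)]
    sds = [sqrt(p_bar * (1 - p_bar) / n) for n in sizes]
    z = [(p - p_bar) / sd for p, sd in zip(props, sds)]
    moving_ranges = [abs(b - a) for a, b in zip(z, z[1:])]
    sigma_z = (sum(moving_ranges) / len(moving_ranges)) / 1.128
    return [(p_bar - 3 * sd * sigma_z, p_bar + 3 * sd * sigma_z)
            for sd in sds]

# Example: daily counts of a categorical flag out of 1,000 records.
counts = [50, 60, 55, 52, 58]
sizes = [1000] * 5
limits = laney_p_prime_limits(counts, sizes)
```

When sigma-z is close to 1, the data behave as a classical p-chart assumes; when it is much larger, the widened limits stop the chart crying wolf on big subgroups.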

Some care is still required in setting up the sampling regimen for continuous data. Note that it is not necessary to use all of the data collected each day to check whether an input variable's process maintains the characteristics it had when the model was implemented; the sample just needs to be large enough to be representative.

Of course, it is necessary to balance the effort involved in reacting to changes against the benefit, especially in models with large numbers of inputs. If you have completed an FMEA (Failure Mode and Effects Analysis) on your model, for example when it was first implemented, you will already have a sense of the relative importance of different inputs, and of the impact on overall model performance of changes to particular variables. In some cases, it might be unnecessary to take any action; other changes might warrant immediate action to prevent poor decisions being made.
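One simple way to turn an FMEA into a monitoring priority list is to rank inputs by their risk priority number (RPN), the product of the severity, occurrence, and detection scores assigned during the analysis. The sketch below assumes each score is on the conventional 1 to 10 scale; the input names and scores are purely illustrative.

```python
def rank_inputs_by_rpn(fmea):
    """Rank model inputs by FMEA risk priority number.

    RPN = severity x occurrence x detection, each scored 1-10.
    High-RPN inputs earn frequent checks and alerts; low-RPN
    inputs may only need periodic review.
    """
    scored = [(name, s * o * d) for name, (s, o, d) in fmea.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Illustrative (severity, occurrence, detection) scores.
fmea = {
    "vehicle_age": (7, 4, 3),
    "annual_mileage": (5, 3, 2),
    "postcode_risk": (8, 6, 5),
}
ranked = rank_inputs_by_rpn(fmea)  # postcode_risk comes out on top
```

The ranking then maps directly onto the check schedule: daily control charts for the top of the list, monthly or quarterly reviews for the tail.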

Sensible model surveillance combined with a well thought out schedule of model checks is essential to keeping a great production model up to date. Prioritising checks on the key variables and setting up warnings for when a change has taken place will ensure that you are never caught by surprise by a change to the environment that robs your model of its efficacy.

Check me out on leanpub!
