Solving Machine Learning’s ‘Last Mile Problem’ for Operational Decisions

Greger Ottosson
Towards Data Science
7 min read · Sep 4, 2019



An analytical insight or trained model has little value if it’s not being used.

In fact, a central tenet of the value proposition for Data Science is that Machine Learning (ML) models can be interpreted and applied in a business context. Insights, classifications and predictions need to influence decision makers, reach front-line staff, or be embedded in business applications. However, many Machine Learning projects fail in this regard. This is Data Science’s version of the last mile problem, and one of the main obstacles in the broader quest to “operationalize” ML.

When attempting to use ML for operational decisions, i.e. embedding ML-based models within operational applications and business processes, the Last Mile problem can be subdivided into:

  • Interpreting model responses to make decisions
  • Creating trust, transparency and organizational change
  • Governing lifecycles and assets across Data Science, IT and Business

This article will focus on the first item: how we invoke ML models, interpret the response, and combine that response with business policy in order to make decisions.

Today there exists an impressive software and hardware stack for Machine Learning that supports end-to-end gathering of data, creation of data pipelines, feature engineering, training of models, visualization of results, model monitoring and model deployment behind REST APIs. Greatly simplified, this process can be illustrated like this:

Machine Learning Pipeline vs The Last Mile

To date, not enough consideration has been given to what happens after the model has been “deployed”, i.e. the “Last Mile”. Deploying a Machine Learning model behind a REST API is a good starting point, but it’s the beginning of the journey, not the end. To make your ML-based predictions useful in a business application or process, you need to figure out when to invoke the model, how to interpret the response, and how to convert that response into an actionable decision.

Invoking a Model

First of all, you need to decide when you can safely use predictions made by an ML model. Typically a model is not reliable outside the bounds of the data it’s been trained on. For example, a risk model for auto insurance that’s been trained on drivers aged between 25 and 65 might not be robust in predicting risk for younger drivers < 25, or seniors > 65. This generalizes to combinations of features as well. A model trained with sufficient data about young drivers in urban areas doesn’t necessarily offer reliable predictions about young rural drivers. Safeguarding the ML model from cases it’s not trained for is essential to safeguarding your business.
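
As a minimal sketch, assuming a hypothetical auto-insurance risk model and hand-picked feature bounds, such a safeguard can be as simple as a rule that checks each input against the ranges the model was trained on:

```python
# Hypothetical bounds of the training data for an auto-insurance risk model.
# In practice these would be derived from the training set itself.
TRAINED_BOUNDS = {
    "driver_age": (25, 65),
    "vehicle_value": (5_000, 80_000),
}

def within_training_bounds(features: dict) -> bool:
    """Return True only if every bounded feature falls inside the range
    the model was trained on; otherwise the prediction should not be trusted."""
    for name, (low, high) in TRAINED_BOUNDS.items():
        value = features.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True

# A 21-year-old driver falls outside the trained age range, so the
# application should not rely on the model's score for this case.
print(within_training_bounds({"driver_age": 21, "vehicle_value": 12_000}))  # False
print(within_training_bounds({"driver_age": 40, "vehicle_value": 12_000}))  # True
```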

This leads to the need for business rules to decide when not to use a model, when to use it as-is, and when to use it with some a posteriori adjustments. This choice comes up frequently, whether you’re targeting a new customer segment, introducing a new product or processing a new type of transaction.

The first option is to ignore the model at first, bootstrapping the business until you’ve got enough data to be confident about its predictions. Typically this means relying on less precise, human-authored business rules or decision tables. If these are what you used before adopting ML, they can be handy to keep around while ramping up.

In other cases, you can venture that the new product/transaction/segment is similar enough to something you already have enough data about. Perhaps “risk of young rural drivers = risk of young urban drivers - 5%” is a better risk assessment than no data-based prediction at all. Again, a few business rules can make that connection, invoke the model and make the adjustment.
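
Sketched in code, with a placeholder model call, a hypothetical coverage gap and an illustrative adjustment, such routing rules might look like this:

```python
def predict_risk(features: dict) -> float:
    """Placeholder for the deployed model's scoring call (e.g. a REST endpoint)."""
    return 0.12

def rule_based_risk(features: dict) -> float:
    """Hand-authored fallback for segments the model wasn't trained on."""
    return 0.25 if features["driver_age"] < 25 else 0.15

def assess_risk(features: dict) -> float:
    # Rule 1: no training-data coverage at all -> bootstrap with human-authored rules.
    if features.get("vehicle_type") == "motorcycle":  # hypothetical gap in the training data
        return rule_based_risk(features)
    # Rule 2: similar-enough segment -> invoke the model on a proxy case, then adjust.
    if features.get("area") == "rural" and features["driver_age"] < 25:
        urban_proxy = dict(features, area="urban")
        return predict_risk(urban_proxy) - 0.05  # "young rural = young urban - 5%"
    # Rule 3: well-covered segment -> use the model as-is.
    return predict_risk(features)

print(assess_risk({"driver_age": 22, "area": "rural", "vehicle_type": "car"}))  # 0.07
```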

In practice, for enterprise decisions, virtually all invocations of ML models will be surrounded by at least a few business rules or decision tables that decide what model(s) to invoke for each specific case.

Adjusting Predictions

Availability of data may limit the predictive power of Machine Learning models. We might not have sufficient data about features we know or suspect should impact a prediction. In such cases it can be necessary to adjust a score or modify a classification after the ML model has been invoked.

For example, public datasets for real estate transactions are available in several countries and geographies. The French government’s version is quite comprehensive, containing all real estate transactions in France from 2014 to 2018. The data include the location of the property, number of rooms, size of interior living space, size of the plot, etc. What’s NOT included, however, is whether the property has a view or a pool, the state of its interior amenities, or the date of the last renovation. So while it’s possible to build a regression model to estimate the value of a property, it can presumably be off by +/- 20% if these additional parameters are not taken into account separately. A set of business rules or decision tables can easily be used to adjust the property value up or down, arriving at a much better prediction.
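
As a sketch, such rule-based corrections could sit on top of the regression estimate like this (the adjustment percentages are assumptions for illustration, not derived from the dataset):

```python
def adjust_property_estimate(base_estimate: float, extras: dict) -> float:
    """Apply rule-based corrections for features the regression model never saw."""
    adjusted = base_estimate
    if extras.get("has_view"):
        adjusted *= 1.10  # assumed premium for a view
    if extras.get("has_pool"):
        adjusted *= 1.05  # assumed premium for a pool
    if extras.get("years_since_renovation", 0) > 20:
        adjusted *= 0.90  # assumed discount for a dated interior
    return adjusted

# base_estimate would come from the regression model trained on the public data
print(adjust_property_estimate(300_000, {"has_view": True, "years_since_renovation": 25}))
```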

This situation is not unique to real estate. Risk scores in financial services, product recommendations in retail, fraud detection in payments, and so on can all benefit from adjustments based on real-time data that wasn’t available at model training time.

Perhaps someday every piece of data will be available in clean and digestible form for our algorithms, but in the meantime it’s often necessary to make adjustments to our baseline predictions.

Combining Models

Running a business is about balancing risk vs return. Making that judgement extends into operational business decisions as well. For example, what’s the estimated profit from extending a specific loan vs the risk of the customer defaulting? What’s the probability of customer churn if we don’t offer the loan? What’s the impact on customer loyalty if we block a credit card transaction, vs the risk of loss if it’s fraudulent?

To make operational decisions, we often need to stitch together a group of predictive models and policy rules. For example, in financial services, here’s a typical list of components to consider to make a sound decision (a sketch in code follows the list):

  • Risk score (predictive model or risk table)
  • Eligibility (policy rules)
  • Life-Time Value (predictive model)
  • Churn score (predictive model)
  • Pricing policies (policy rules)
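
Here is a sketch of how these components might be combined; the scores, thresholds and pricing formula are purely illustrative placeholders, not taken from any real system:

```python
def decide_loan(applicant: dict) -> dict:
    """Combine predictive scores with policy rules into one operational decision."""
    # Predictive components (placeholders for calls to deployed models).
    risk = 0.18    # e.g. risk_model.predict(applicant)
    ltv = 4_200.0  # e.g. lifetime_value_model.predict(applicant)
    churn = 0.35   # e.g. churn_model.predict(applicant)

    # Eligibility: hard policy rules, checked before any score is weighed.
    if applicant["age"] < 18 or applicant["income"] < 15_000:
        return {"approved": False, "reason": "not eligible"}

    # Risk policy: decline clearly risky cases regardless of upside.
    if risk > 0.40:
        return {"approved": False, "reason": "risk above threshold"}

    # Pricing policy: base rate plus a risk premium, discounted for valuable,
    # churn-prone customers to protect loyalty.
    rate = 0.049 + 0.10 * risk
    if churn > 0.30 and ltv > 3_000:
        rate -= 0.005

    return {"approved": True, "rate": round(rate, 4)}

print(decide_loan({"age": 34, "income": 52_000}))  # {'approved': True, 'rate': 0.062}
```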

Sometimes it’s even necessary to combine multiple models that aim to predict the same thing, but are derived from separate data sources. For example, in healthcare there exists a multitude of studies around diabetes, but the differences in research methodology (length of study, control group design, data collected, etc.) make it difficult to merge the data and train a single ML model. The best approach might instead be to train and deploy multiple risk models, and use business rules to calculate a weighted score based on the patient being evaluated.
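
A minimal sketch of that weighting approach, with made-up models and relevance rules standing in for the real study-derived ones:

```python
def combined_risk(patient: dict, models: list) -> float:
    """Weighted average of several models that predict the same outcome,
    where business rules assign each model a relevance weight per patient."""
    scores, weights = [], []
    for model in models:
        weight = model["relevance"](patient)  # 0 means this study doesn't apply
        if weight > 0:
            scores.append(model["predict"](patient) * weight)
            weights.append(weight)
    return sum(scores) / sum(weights) if weights else float("nan")

# Hypothetical diabetes risk models, each trained on a different study.
models = [
    {"predict": lambda p: 0.22, "relevance": lambda p: 1.0 if p["age"] >= 40 else 0.0},
    {"predict": lambda p: 0.30, "relevance": lambda p: 0.5},  # smaller, older study
]
print(combined_risk({"age": 55}, models))  # ~0.247
```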

In effect, multiple ML models are often used in a business decision, combined with business rules that express policy. Modeling a decision is the art of combining these predictive and prescriptive assets.

Applying Policy

The vast majority of enterprise decisions are subject to business policies or industry regulations, or should at least adhere to “common sense” rules. Here are some examples:

Product Recommendations

  • Don’t promote offers that customers already have
  • Don’t promote offers conflicting with outstanding offers
  • Don’t promote unnecessarily high discounts (by customer segment/tier)

Advertising

  • Restrict what ads are shown based on demographics, geography and channel
  • Don’t display conflicting ads or ads from competing brands (“brand-safe advertising”)
  • Optimize ad delivery in order to satisfy, but not exceed, ad budgets

Insurance

  • Enforce manual claims review on claims from previously fraudulent customers
  • Automatically pay out low-value, low-risk claims (based on customer segment/tier)
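
As an illustration of the product-recommendation policies above, here’s a minimal sketch that assumes a hypothetical list of model-scored offers and applies the rules after scoring:

```python
# Assumed per-tier discount caps (policy, not learned from data).
MAX_DISCOUNT = {"standard": 0.10, "gold": 0.20}

def filter_offers(scored_offers: list, customer: dict) -> list:
    """Apply policy rules after the recommendation model has scored the offers."""
    kept = []
    for offer in sorted(scored_offers, key=lambda o: o["score"], reverse=True):
        if offer["product"] in customer["owned_products"]:
            continue  # don't promote offers the customer already has
        if offer["product"] in customer["conflicting_offers"]:
            continue  # don't conflict with outstanding offers
        if offer["discount"] > MAX_DISCOUNT[customer["tier"]]:
            continue  # don't promote unnecessarily high discounts for this tier
        kept.append(offer)
    return kept

customer = {"owned_products": {"gold_card"}, "conflicting_offers": set(), "tier": "standard"}
offers = [
    {"product": "gold_card", "score": 0.9, "discount": 0.05},
    {"product": "travel_insurance", "score": 0.7, "discount": 0.15},
    {"product": "savings_account", "score": 0.6, "discount": 0.05},
]
print(filter_offers(offers, customer))  # only savings_account survives the policy filter
```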

In some cases, business rules like these are soft preferences rather than hard rules, and they can be embedded in the ML model through training. In other cases this is difficult or undesirable. How would you ensure you’re not displaying competing ads to specific customers (ideally across sessions and channels)? What ML feedback mechanism could prevent a model from offering products a customer already owns? How would your company demonstrate to auditors and agencies that your decisions followed regulations (100% of the time)?

The truth is that Machine Learning is a probabilistic method, and not ideally suited to adhere to deterministic policies and rules. Applying business rules after ML-based predictions or classifications will typically deliver both better conformance and transparency.

Conclusions

A “purist” vision of Machine Learning might suggest that business policy isn’t needed, or that such policy can and should be learnt from data. This school of thought says that by connecting learning to the right real-world outcomes and designing effective feedback loops (including maybe randomization and A/B testing), the ML system will gradually learn to take the correct and optimal decision without a posteriori intervention.

A purist “ML-only” approach might work well for playing chess, recognizing images or classifying text. For enterprise decisions, ML is rarely sufficient by itself.

In practice, whether to implement a specific requirement by “applying policy with rules” or by “training the ML model” is a case-by-case consideration. For organizations new to Machine Learning, or when infusing ML into existing business processes, it is usually easier to enforce policy requirements outside of the ML model, at least to start with.

You might be asking what software you can leverage to cross the Last Mile with Machine Learning. While the typical Machine Learning framework doesn’t extend much beyond model deployment, there’s another software category that fills the gap between ML models and Business Applications.

Most often referred to as Decision Management (or “Digital Decisioning Platforms” in Forrester’s terminology), these platforms came out of the era of Business Rules Management Systems (BRMS), but have grown to cover decision modeling and execution as a whole. This means that Decision Management platforms span both predictive models and prescriptive business rules, and in general enable the modeling, monitoring and governance of operational decisions.

Greger works for IBM and is based in France. The above article is personal and does not represent IBM’s positions, strategies or opinions.
