In the analysis of data, I’ve seen a growing tendency to focus on increasingly sophisticated methodologies to the detriment of the actual objectives of using data. There’s a draw to assigning overly simplistic metrics to every business problem to gauge the progress of products, processes, or employees. In applied data science, there’s constant pressure to brag to superiors, investors, and competitors about using the latest and most powerful "AI", to the point of applying methodologies out of context and ineffectively. Successfully deploying an out-of-the-box AI technology is often easier than determining whether it achieves any business objective. I’d like to talk about the importance of understanding the full complexity of a data problem and aligning it with objectives. Even the most modern and powerful AI methodologies can be executed poorly in the absence of good judgment or good data.
Bad Data
This Berkeley study explores how racial discrimination in consumer lending can be either ameliorated or exacerbated by automated recommendation systems, depending on the execution. Automated systems could plausibly reduce discrimination by avoiding the personal biases that arise in face-to-face meetings. However, even good-faith attempts to create a fair system can produce economically damaging decisions if historical biases are inherent to the datasets used.
A different application with a similar flaw was an Amazon-built recruiting tool that was found to have inherent biases against women. According to this Reuters article, the dataset used to train the model was heavily dominated by men because mostly men had applied for technical positions. Gender bias in technology fields already existed because of long-held personal, cultural, and societal factors, and that bias was therefore baked into the historical training dataset used to build the recommendation model. The failure here was manifold: there was a failure to fully define the objectives from the start (to build a fair resume reader, not just one that perpetuates the status quo); there was consequently a failure to build a training dataset that reflected that objective; and there was a failure to evaluate the model’s performance a posteriori.
Assuming the particular AI methodologies were applied and executed properly in both cases, the patterns inherent to the training data were learned, because that is exactly what good AI methodology is supposed to do. If unintended and unwanted biases exist in that data, they will be reflected in the resulting models. A good classical statistician knows that datasets must be meticulously crafted and the resulting models continually evaluated. A successful application of modern data science cannot forget the art of classical statistics.
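To make that concrete, here is a minimal, hypothetical sketch (all data synthetic, scikit-learn assumed) of how a perfectly ordinary classifier faithfully absorbs a bias planted in its training labels:

```python
# Hypothetical illustration: a model trained on historically biased
# hiring decisions learns to reproduce that bias. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Synthetic applicants: a skill score (what we *want* the model to use)
# and a gender flag (1 = male, 0 = female).
skill = rng.normal(0, 1, n)
is_male = rng.binomial(1, 0.8, n)  # historical applicant pool is 80% male

# Historical hiring decisions favored men independently of skill --
# this is the bias baked into the "ground truth" labels.
p_hire = 1 / (1 + np.exp(-(1.5 * skill + 1.0 * is_male - 1.0)))
hired = rng.binomial(1, p_hire)

# An entirely standard model trained on those labels...
model = LogisticRegression().fit(np.column_stack([skill, is_male]), hired)

# ...learns the bias faithfully: identical skill, different predictions.
same_skill = np.array([[0.5, 1], [0.5, 0]])  # same skill; male vs. female
print(model.predict_proba(same_skill)[:, 1])  # male prob > female prob
```

The model is not malfunctioning here. It is doing precisely what it was asked to do with the data it was given.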
Metric Fixation
Not clearly defining objectives, not choosing metrics that match those objectives, and failing to evaluate the context in which metrics are applicable can all doom a data-driven project. As a thought experiment, let’s consider two admittedly simplified, personal traps from the realm of physical fitness that many people fall victim to: an allegory about obsessing over oversimplified metrics while forgetting about objectives. I’ve personally known people who have roughly fallen into each of the following traps, and I’d like to reframe each as a data problem that went awry through familiar human nature (noting that in reality, both are serious psychological and emotional problems that are not to be taken lightly).
Consider two individuals, "Pat" and "Terry", who have each decided to make a positive change in their lives and get in good shape. To Pat, getting in good shape means losing weight; to Terry, it means getting stronger. Neither defines his goal any more precisely than that, and both have obsessive, perfectionist personalities.
Pat starts by exercising more and tracking his eating to maintain a caloric deficit. He chooses his body weight alone as the metric to gauge his progress, as so many of us do. He starts to see progress, and as the number drops he feels better, healthier, more energetic, and looks better. Time goes by and he gets addicted to the progress just as the rate of change in the metric grinds to a halt. More extreme exercise and more extreme caloric restriction are needed to get the metric moving again. Obsessed with the metric, Pat’s health takes a turn for the worse as he continues to drop his weight to unhealthy levels.
Terry starts serious strength training, and his metrics for progress are his one-repetition max for the bench press, back squat, and deadlift. Like any new lifter, his progress at the beginning is rapid, and he loves the newfound confidence and bounce in his step. The years go by and progress grinds to a halt for Terry as well. As he exerts ever more effort to feel the gratification of progress on his chosen metrics, the injuries accrue, training must be halted, and progress is lost. Even if the injuries are overcome, there are very unhealthy ways to push the metrics to new levels.
Pat and Terry are making data-oriented mistakes with common parallels in a business environment. First, they did not precisely define their objectives; they can’t achieve healthy living without clearly defining the full context of what that means. Second, in the absence of precisely defined objectives, they chose overly simplified metrics that do not reflect the full context of what being healthy means. Losing weight can be an important metric for a healthy body, but not in isolation from others that fully reflect what the human body needs to operate properly. Similarly, gaining strength can drastically improve physical well-being, but chasing high numbers without limit can stress the body in unhealthy ways. Lastly, even the admittedly insufficient metrics they chose were not given a context of applicability. While losing weight and gaining strength are usually positive developments for someone’s health, there are obvious limits to how far they correlate with improved health.
These mistakes, an unhealthy fixation on metrics in the absence of a well-defined mission, are common and costly in business.
Social scientist Donald T. Campbell has an often-quoted law bearing his name:
"The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."
In business and elsewhere, if you reward employees by some metric, employees will act in accord with the metric rather than its underlying purpose. If teachers are evaluated only on how well their students perform on standardized tests, they will teach only to the test. When Wells Fargo incentivized the creation of new savings and checking accounts, employees created millions of fraudulent accounts, and the company was fined billions of dollars.
Data science and AI solutions are built with metrics. Perfectly built AI software cannot produce useful models if the metrics used to train the model do not reflect the complexity of the problem to be solved, if the metrics themselves contain biases that should not be present in the model, if the objectives of the intended model are not adequately defined, or if the context of applicability of each metric is not understood. The data science industry readily rewards those who have mastered the execution of particular advanced AI methodologies, but too often shrugs at the more important tasks of fully defining problems and cultivating intuitive knowledge of clean datasets.
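A toy numerical example (entirely synthetic) of that first failure mode: a metric can look excellent while the model misses the objective entirely. With 1% positive cases, a model that never flags anything scores 99% accuracy and catches nothing:

```python
# Toy illustration with synthetic numbers: optimizing a poorly chosen
# metric (accuracy) while completely failing the underlying objective.
import numpy as np

rng = np.random.default_rng(1)
y_true = rng.binomial(1, 0.01, 100_000)  # only 1% of cases actually matter
y_pred = np.zeros_like(y_true)           # a "model" that flags nothing

accuracy = (y_pred == y_true).mean()     # fraction of all predictions correct
recall = y_pred[y_true == 1].mean()      # fraction of real positives caught

print(f"accuracy = {accuracy:.3f}")  # ~0.990, looks great
print(f"recall   = {recall:.3f}")    # 0.000, objective completely missed
```

The cure is not a cleverer model but a metric, such as recall in this toy case, chosen to reflect the actual objective.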
Data Science Leadership
Effective data science leadership will require stepping back from obsession over the latest AI breakthroughs to understand the broader landscape of executing a modern data study.
Everything starts with a full and precise definition of the business objectives. Aspiring junior data scientists have spent their educational careers learning math, statistics, coding skills, database management, and the latest modeling methodologies. The excitement to brag, especially on one’s resume or at conferences, about successfully deploying the latest AI tech can be overpowering. Too often, that deployment does not solve any existing problem. A truly successful data science solution solves a problem with a purpose. Start with the objective and work your way back to the most appropriate solution, regardless of whether that solution is a bragging-rights-worthy Word2Vec model or a simple linear regression.
Before exploring solutions, however, collect a dataset that accurately reflects the complexity of the problem to be solved. Understand the qualitative meaning of each metric in that dataset. Ask yourself if there are limits to the context in which each metric is useful. Ask yourself if there are variables that could affect your objective that are not reflected in that dataset. Are there problems that you should be fixing (like racial or gender biases) in decision making which are inherent to the very dataset you are using? Are there contradictions inherent to your dataset that need to be resolved before proceeding further?
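One cheap, concrete first check for that kind of inherited bias is sketched below. It assumes a hypothetical pandas DataFrame with columns "group" (a protected attribute) and "outcome" (the historical decision you plan to train on); the column names and numbers are illustrative only:

```python
# A pre-modeling audit sketch: compare historical outcome rates across
# groups before any model sees the data. DataFrame contents are made up.
import pandas as pd

df = pd.DataFrame({
    "group":   ["A", "A", "A", "B", "B", "B", "B", "B"],
    "outcome": [1, 1, 0, 1, 0, 0, 0, 0],
})

# Large gaps in historical outcome rates across groups are a signal to
# investigate before training anything on these labels.
rates = df.groupby("group")["outcome"].mean()
print(rates)  # A: ~0.67, B: 0.20 in this toy example
```

A gap in those rates does not prove discrimination on its own, but it flags exactly the kind of inherited pattern a model will learn if it goes unexamined.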
Once you have a dataset that appropriately reflects the problem to solve, deciding on a methodology is critical. The first question I like to ask is: what is the shape of the data? If the data is heavily relational, a graph representation and a network science approach may be most effective. If the data is time-series oriented, forecasting may be the way to go. Free-text data clearly points to natural language processing. Good judgment at this stage will determine success or failure.
Only after all these requirements are met does effective execution of any particular data science or AI modeling technique have a chance at being relevant or useful.