
To find a role in Data Science after graduation, students need to complete a comprehensive and applied curriculum including mathematics, statistics, and computer science/programming. They also need a solid business context for the analytics techniques, such as machine learning, that they will learn.
In addition to traditional business coursework, graduate degrees in data science and analytics often culminate in an industry-supplied consulting project that provides hands-on problem-solving experience with real-world business problems sourced from partnering companies. One issue that these industry-supplied consulting projects can highlight is that not all business problems can be solved with machine learning alone. Some problems require an optimal solution that can be used to make a data-driven decision – so not only predictive insights that offer a view into what will happen next, but a trusted prescriptive course of action that businesses can confidently operationalize.
Data science graduate degree programs often focus on teaching predictive analytics, but not all programs teach students prescriptive analytics (i.e. the ability to leverage optimization – the primary prescriptive analytics tool – to find solutions to complex business problems and make optimal decisions).
Over the last two months, I spoke with professors who are shaping top-ranked data science programs, industry experts, and leaders placing top data science talent in organizations about why infusing optimization into the data science toolkit in academic programs is essential to shaping the next generation of problem solvers.
We discussed emerging trends, how optimization is currently being taught to data scientists, and the value of introducing students to optimization problems. Students who can combine solution approaches can offer their future employers not only predictive insights but also the prescriptive power that leads to trusted decision making.
The insights I gained from my conversations led me to conclude that there are three main reasons why optimization should be considered an essential element in data science and analytics programs:
1. BUSINESS LEADERS ARE EXPECTING MORE FROM THEIR DATA THAN EVER BEFORE.
Businesses are collecting more data on their customers, processes, and products than ever before. Forbes estimates that nearly half of enterprises are either starting new analytics projects or forging ahead on existing ones, and highlights that companies that can’t quickly derive insights (and subsequent decision-making capabilities) from their data are falling behind.
The applications for optimization, a data-driven prescriptive analytics technology, are wide-ranging, and prescriptive capabilities are changing the way business is done on a global scale. Companies use massive quantities of data to inform critical decisions on everything from vaccine distribution plans and matching organ donors with recipients to global shipping logistics and scheduling over 100,000 flights each day.
Some companies are still not at that level of maturity in their analytics journeys, however. What is clear is that no matter the stage of analytical maturity of a company, business leaders are better informed than ever before and are expecting much more from their data.
In discussing the wide use cases of optimization, Dr. Ed Rothberg, CEO of Gurobi Optimization, posed the question to me: "There’s a collective realization that machine learning isn’t a solution for every problem, so what do you do then?" He discussed what he’s hearing from leaders as their awareness and understanding of optimization evolves, and his assessment is that it’s ultimately up to the business leaders to recognize that optimization opportunities should be prioritized to add value to the business.
According to Dr. Michael Watson, Northwestern adjunct professor and Coupa AI Leader, "Business leaders do understand the value of optimization, but they might not understand the specifics of what technology actually leads to it. They understand the value of using data to make a particular decision."
2. IF COMPANIES DON’T HAVE PRESCRIPTIVE CAPABILITIES NOW, THEY SOON WILL.
Having access to optimal solutions, given many complex trade-offs, helps business leaders act quickly and confidently. Machine learning models can successfully predict what will happen next based on historical and real-time data, but they don’t help the business make optimal or explainable decisions on what to do next. If companies aren’t using optimization, they will be soon.
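To make the distinction between prediction and prescription concrete, here is a minimal Python sketch. Every name and number in it is hypothetical and chosen only for illustration: a forecast supplies predicted demand for two products, and a small optimization step then decides how many units of each to produce under a shared capacity limit. Brute-force enumeration stands in for a real optimization solver, which a production model would normally use instead.

```python
from itertools import product

# Hypothetical inputs for illustration only; none of these figures come from the article.
predicted_demand = {"A": 40, "B": 30}    # units, e.g. the output of a predictive model
profit_per_unit = {"A": 5.0, "B": 8.0}   # profit earned per unit produced and sold
hours_per_unit = {"A": 1, "B": 2}        # production hours needed per unit
capacity_hours = 80                      # total production hours available


def best_plan():
    """Enumerate every integer production plan and return the most profitable feasible one."""
    best, best_profit = None, float("-inf")
    # Never produce more than the predicted demand for either product.
    for a, b in product(range(predicted_demand["A"] + 1),
                        range(predicted_demand["B"] + 1)):
        hours = a * hours_per_unit["A"] + b * hours_per_unit["B"]
        if hours > capacity_hours:
            continue  # infeasible: exceeds the shared capacity
        profit = a * profit_per_unit["A"] + b * profit_per_unit["B"]
        if profit > best_profit:
            best, best_profit = {"A": a, "B": b}, profit
    return best, best_profit


plan, profit = best_plan()
print(plan, profit)
```

The forecast alone says how much could be sold. The optimization step turns that forecast into a decision by weighing profit against the capacity each product consumes.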
Gartner predicts that, by 2022, the prescriptive analytics software market will reach $1.88 billion (representing a 20.6% CAGR from 2017) and 37% of large and midsize organizations will be using some form of prescriptive analytics technologies.
There are many companies that are still in early stages of building out their analytics capabilities. Some have data science teams in place but haven’t moved beyond using their data to make predictions. Linda Burtch, managing director and founder of the quantitative executive search firm Burtch Works, shared her thoughts on the future trends of analytics teams: "Being able to get prescriptive capabilities is the holy grail, the leadership needs to believe in it, and they need to push for that. Companies have to walk before they run and a lot of these teams are just starting to trot right now, so give it time and I think more companies will get to that prescriptive approach."
3. STUDENTS WHO AREN’T LEARNING OPTIMIZATION ARE AT RISK OF NOT BEING PREPARED TO SOLVE THE KINDS OF PRESSING QUESTIONS THAT COMPANIES NEED ANSWERED.
As I spoke with experts and academics, it became clear to me that almost everyone I connected with had a relevant story about a company providing an industry-supplied consulting problem (either in the form of a capstone or practicum project) to student teams that had both machine learning and optimization components within the problem statement. In most cases, the companies didn’t even necessarily realize that the problems had optimization components, but they did realize that they needed a decision and recommendations on implementation based on the data that they provided.
The industry-supplied problems couldn’t be solved with either prediction or prescription alone, and the students needed to incorporate mathematical modeling in order to provide a solution and insights on what to do next. I will reshare one of the stories that Dr. Joel Sokol, professor and director of Georgia Tech’s MS in Analytics program, shared with me: "We had a company come to us with a research question where they suddenly had access to all new datasets and they wanted to be able to start putting all that information together, but the datasets were from different sources and inconsistent (duplicates, inconsistent labeling, etc.). How do you go about matching data? There are predictive pieces and prescriptive pieces to answering that question and our students needed to think about all the models in their toolkit and how to use them without the artificial differentiation (between prediction and prescription)."
This kind of story was one that I heard multiple times, but it’s also one that I experienced first-hand. I was formerly the Associate Director for Northwestern University’s Master of Science in Analytics program, where we required two sets of industry-supplied consulting projects for each cohort of students. The projects were embedded in the curriculum to introduce our students to real-world problem solving and to provide the kind of hands-on experience that would be beneficial as they prepared for roles as data practitioners and future leaders in data science. Over my years with the program, multiple submitted projects required optimization in order to produce a useful deliverable that the company could implement.
I was lucky enough to work with a faculty director, Dr. Diego Klabjan, who recognized the value of data scientists understanding the full range of analytics techniques. He championed data science students learning optimization early in the program and was one of the pioneers in offering an optimization course as a core requirement so students could identify the correct analytical approach for a problem and solve it efficiently.
Joel Sokol (another pioneer in graduate-level data science education) expanded on this idea and explained how his students are tackling problem solving in a program that’s interdisciplinary by nature: "We present the business question to the students, and we don’t say whether it’s predictive or prescriptive. The students need to figure out what data is needed to answer the questions, and what models are needed to be able to get to where they need to be to solve the problem."
Teaching Optimization to Data Science Students
Data science students are great problem solvers by nature. They often enter their data science or analytics graduate degree programs with a strong math background, a foundation in programming (increasingly, we are seeing Python as the dominant programming language), and a natural curiosity about how to discover valuable patterns in data. These are students with the quantitative aptitude and appetite to find the best solution, and that capability lends itself naturally to solving optimization problems. Whether or not they know it, they are also already using machine learning techniques that draw on optimization.
Michael Watson explained that "a good data scientist already knows math. They use statistics and the algorithms they build aren’t too far away from mathematical modeling. Optimization is behind the scenes in so many of the algorithms that data scientists already use, regression models use optimization, deep learning has optimization embedded in it."
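As a small illustration of the point that regression has optimization behind the scenes, the sketch below fits a straight line by minimizing mean squared error with plain gradient descent. It is a minimal example under stated assumptions, not how any particular library implements regression: the function name `fit_line`, the toy data, the learning rate, and the step count are all hypothetical choices made for this illustration.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ≈ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step downhill: this update is the optimization hiding inside the fit.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

"Training" the model here is nothing more than searching for the slope and intercept that minimize an objective, which is the same mindset students apply when they formulate a prescriptive optimization model.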
Data science and analytics master’s degrees often take about one year to complete, though programs can range from nine months to two years. Programs are tasked with providing a curriculum that prepares students for lucrative careers in data science. They often have only two semesters’ worth of courses in which to prepare their students, and they need to make tough decisions about what is essential to set their students up for success as data scientists. Given those constraints, it makes sense that many programs have opted not to include optimization in their core requirements. Michael Watson explained that "most students had never seen, or even heard of optimization, so it is important to introduce them to this new way of using data to make a decision and actually using data to come up with an answer and prescribe a solution, which is different than using data to make a prediction (which is what traditional machine learning programs teach). Without optimization, students are missing opportunities."
The kinds of problems that companies need solved are becoming more complex, and the ability to not only recognize optimization problems, but also to provide optimal solutions to business leaders will be a differentiator for data science students entering the job market in the coming years. Joel Sokol has seen that success firsthand, and shared that "optimization is one of those elements that is considered a core piece of our curriculum. It’s a fit in the program and some of our alumni even say that it’s a core differentiator. They can now take the extra step: now that they have a good understanding of how things work and decent predictions of what’s going to happen, what do they do with that? How do they make decisions based on that? That optimization piece really comes in as that next step in the data science progression."