The Rise of the Term “MLOps”

Properly Operationalized Machine Learning is the New Holy Grail

Kyle Gallatin
Towards Data Science


“MLOps (a compound of Machine Learning and “information technology OPerationS”) is [a] new discipline/focus/practice for collaboration and communication between data scientists and information technology (IT) professionals while automating and productizing machine learning algorithms.” — Nisha Talagala (2018)

For folks interested, I’ll also be teaching a Machine Learning in Production class in 2022–23!


The understanding of the machine learning lifecycle is constantly evolving. When I first saw graphics illustrating this “cycle” years ago, the emphasis was on the usual suspects (data prep and cleaning, EDA, modeling, etc.). Less attention was given to the more elusive and less tangible final stage, often termed “deployment”, “delivery”, or in some cases just “prediction”.

At the time, I don’t think many rising data scientists really considered the sheer scope of that last term (I sure as hell didn’t). “Prediction” didn’t just mean .predict(); it implied true scale, production-level deployment, monitoring, and updating: a true cycle. Without the engineering skills needed to make this vague concept a reality, the data scientist was stuck in the notebook. Models lived as .pickle files on a data scientist’s local machine, performance was reported with PowerPoint, and the ML lifecycle was broken.
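To make that gap concrete, here’s a minimal sketch (my own toy illustration, not anyone’s real production system) of the smallest possible step beyond .predict() in a notebook: loading a pickled model and putting it behind an HTTP endpoint. Flask, the file name, and the payload shape are all placeholder assumptions; real serving adds validation, scaling, logging, and monitoring on top of this.

```python
# Hypothetical minimal "deployment": a pickled model behind an HTTP endpoint.
# File name and payload shape are placeholders for illustration only.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# In the notebook world, the model often lives as a local pickle file.
with open("model.pickle", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON payload like {"features": [[1.0, 2.0, 3.0]]}
    features = request.get_json()["features"]
    prediction = model.predict(features).tolist()
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```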

A Straightforward, but Incomplete Data Science Lifecycle with the word “Modeling” spelled incorrectly

While the end-to-end ML lifecycle has always been pitched as an actual “cycle”, to date there has been limited success in actually managing this end-to-end process at enterprise scale, for what I see as the following reasons:

  • Data scientists are often not trained engineers, and thus do not always follow good DevOps practices
  • Data engineers, data scientists, and the engineers responsible for delivery operate in silos, which creates friction between teams
  • The myriad of machine learning tools and frameworks fosters a lack of standardization across the industry
  • There is not yet a single managed solution that meets the requirements of both engineers and data scientists without being limiting in some way (tied to a specific language, framework, provider, etc.)
  • In general, enterprise machine learning in production is still immature

I’m sure there are more reasons and plenty of sub-reasons that also contribute here, but these high-level issues lead to unfortunate results. Enterprise machine learning is slow and tough to scale, little is automated, collaboration is difficult, and the operationalized models actually delivering business value are few.


Thus, we have the need for good “MLOps” — machine learning operations practices meant to standardize and streamline the lifecycle of machine learning in production. Before I jump into the landscape or any more definitions, however, I want to talk a little bit more about why we need better MLOps.

Machine Learning is Somewhat Mature, but Deployment Practice and Business Impact are Not

In the academic space, machine learning has advanced by leaps and bounds. Algorithms are showing vast improvement over previous work on difficult tasks like NLP (even if that’s just Google throwing more data and compute at it), and the number of machine learning papers on arXiv is said to be doubling every 18 months.

This excitement makes it easy to lose focus on the deliverables: the tangible impact. This Databricks survey from 2018 showed that while the majority of companies are adopting or investing in AI, they also almost universally cite difficulty with their solutions, with “AI” projects taking six months on average to complete.


If you aren’t careful, you can end up with data scientists literally emailing Python notebooks and models to engineers for production deployment and code rewrites. Maybe the poorly documented Python code is too inefficient or incomplete to containerize with Docker, and translating it to Java will take a lot of time. Without extensive documentation the engineer has no idea wtf is in that .pickle file, there’s zero version control for the model, metrics, and parameters, and everyone is confused and pissed because now they’re stuck in painful meetings trying to align on something that should take days instead of months.

Not only that, but once the solution is in production there is no inherent feedback system for improvement and updates. For certain systems, model performance is likely to suffer over time, and monitoring isn’t always standard practice. Data scientists also aren’t really trained to write good test cases, so I wouldn’t be surprised if model updates going straight from Jupyter to production occasionally break applications or serve incorrect predictions.
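As a toy illustration of what such monitoring might look like (my own hypothetical sketch, not a standard recipe), you could compare the training distribution of a feature against recent production traffic with a two-sample Kolmogorov–Smirnov test. The threshold and the data below are made up for the example.

```python
# Hypothetical drift check: compare a feature's training distribution
# against live production values with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(train_values, live_values, alpha=0.01):
    """Return True if the live distribution differs significantly from training."""
    statistic, p_value = ks_2samp(train_values, live_values)
    return p_value < alpha

# Synthetic example: live traffic has shifted upward relative to training.
rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=5_000)
live = rng.normal(loc=0.5, scale=1.0, size=5_000)
print(feature_drifted(train, live))  # True -- time to investigate or retrain
```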

Model Drift — the importance of monitoring for change in distribution

I want to be clear: these issues are not universal, and plenty of custom solutions do exist. However, a single set of practices and an end-to-end solution have yet to surface that meet the needs of the data scientist, the engineer, and the business.

Enter Machine Learning Operations (MLOps)

I would define MLOps in its purest form as the true instantiation of the automated production ML lifecycle. The first paragraph of the Wikipedia page for MLOps also kinda says it all. MLOps is the logical reaction to the current difficulties enterprises face putting machine learning into production. In software engineering we have DevOps, so why not MLOps? Good DevOps ensures that the software development lifecycle is efficient, well documented, and easy to troubleshoot. It’s about time we developed a similar set of standards for machine learning.

A slightly better infographic for the machine learning lifecycle

Industry has begun to hit its breaking point, and technology is evolving rapidly to meet demand and alter the current standard for ML in production. Open-source frameworks like mlflow and kubeflow compete to become the open-source standard, while new startups slap UIs on these solutions in an attempt to bring “proprietary” MLOps products to market.

The State of MLOps

Of course, MLOps is still somewhat in its infancy (in practice, at least). A search for “MLOps” on Towards Data Science yields a measly 2 results (at the time of writing). Technically speaking, a fully managed solution built with tools like mlflow or kubeflow still requires a reasonable amount of development and/or employee education to use in practice.
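To give a flavor of what that development looks like, here’s a minimal, hypothetical sketch of experiment tracking with mlflow: parameters, metrics, and the model artifact logged and versioned in one place instead of a pickle file attached to an email. The dataset and model are placeholders, and a real setup would point at a shared tracking server rather than local files.

```python
# Hypothetical minimal mlflow tracking run; dataset and model are placeholders.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), random_state=42
)

with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 5}
    model = RandomForestClassifier(**params).fit(X_train, y_train)

    mlflow.log_params(params)                                    # hyperparameters
    mlflow.log_metric("accuracy", model.score(X_test, y_test))   # evaluation metric
    mlflow.sklearn.log_model(model, "model")                     # the model artifact itself
```

Even this small amount of structure means the engineer on the other side of the handoff can see exactly which parameters and metrics produced which artifact.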


Now, you’ll also notice I haven’t actually given an exact list of MLOps principles, and that’s because I’m not sure a universal set exists yet. The idea is still in flux, and true principles will take shape as new frameworks come to fruition and real-world lessons accumulate. It’s important to note that, like DevOps, MLOps can be good or bad, and over time the line between the two will become clearer.

For now, I find it best to follow good DevOps practice. There are of course tools that make this work easier, but from a framework-agnostic point of view, I’d imagine good MLOps will look a lot like good DevOps. The goals for MLOps remain clear, and good MLOps would accomplish the following as efficiently as possible:

  • Reduce the time and difficulty to push models into production
  • Reduce friction between teams and enhance collaboration
  • Improve model tracking, versioning, monitoring and management
  • Create a truly cyclical lifecycle for the modern ML model
  • Standardize the machine learning process to prepare for increasing regulation and policy

Conclusion and Caveats

I’m sure that depending on where you are in the industry, you may agree or disagree with what I’ve surmised about the landscape. These viewpoints are the result of my limited experience, and as such are prone to misconception. As Obi-Wan said to Anakin, “Only a Sith deals in absolutes”, and I believe that holds for my subjective analysis of all things.


Still, the purpose of this article was to introduce MLOps as both a concept and possibly one of the next great revolutions in enterprise ML — and I hope to that end I’ve been useful. Feel free to connect with me on LinkedIn, or leave unnecessarily scathing comments below. ✌️

Software Engineer for ML Infra. Building scalable, operationalized machine learning services. I don’t represent my employer.