Exploring the ML Tooling Landscape (Part 2 of 3)

Current ML tooling and adoption

Tom Martin
Towards Data Science


In the previous blog post in this series, we examined overall machine learning (ML) maturity in industry, with a specific focus on machine learning operations (MLOps). The two main takeaways were the striking lack of ML maturity in industry as a whole, and the complexity involved in fully embracing MLOps, which can be taken to represent the apogee of ML maturation. In this blog post, we will consider the implications for tooling adoption in industry and for the wider ML tooling market.

In this series of blog posts, I aim to address the following questions:

  1. What is the level of maturity with regard to ML in industry?
  2. What is the current state of ML tooling and adoption?
  3. What are likely trends in ML tooling?

This blog post is concerned with the second question.

As with the previous post, the same disclaimer applies: This series of blog posts is by no means meant to be exhaustive — or necessarily even correct in places! I wrote this to try to organise my thinking on the reading I’ve done in recent weeks and I want this to become a jumping off point for further discussion. This is an absolutely fascinating field and I am really keen to learn more about the industry, so please get in touch!

The Lay of the Land

It is by no means an overstatement to talk of a crowded ML tooling landscape. The aptly named MAD Landscape (that is, Machine Learning, AI, and Data) lists over 2,000 (this number includes some duplicates) commercial and open-source projects and companies as of late 2021 (Turck, 2021). And that is just one of a handful of similar reviews: (TWIML, 2021), (LF AI & Data Foundation, 2022), (Hewage, 2022), (Ruf, 2021).

Before going any further, it makes sense to clarify some of the terminology on this topic. End-to-end ML platforms are frequently referred to as MLOps platforms (thoughtworks, 2021), data science and machine learning (DSML) platforms (Turck, 2020), among other things. However, following the discussion in the previous blog post, it makes sense to use the term “ML platform” to encompass all such offerings, as MLOps is arguably the end goal of ML maturity. I will also tend to drop the qualifier “end-to-end”, as ML platforms generally aim to serve all steps of the ML workflow. Separate from platforms, the term “tooling” refers to software that addresses a specific step within an ML workflow. I may occasionally use tooling to refer to both together, which should be clear from context, or otherwise use “solutions” to refer to the two collectively.

Although these reviews differ in methodology and breadth, they agree on a number of points. Firstly, there is an observed lack of maturity within the tooling ecosystem itself: no single platform supports a fully automated ML workflow, or at least not one that is accessible to a wide range of stakeholders (Ruf, 2021). A related point can be made for specialist tools, which, depending on their approach, may support more than one ML task (TWIML, 2021). Secondly, an ML platform is considered “good” to the extent that, among other things, it covers all aspects of the ML workflow, allows for automation, and supports multiple languages, frameworks, and libraries (Hewage, 2022). Thirdly, within existing solutions, there is wide divergence in the level of implementation for different ML tasks (Felipe & Maya, 2021). It is reasonable to speculate that many of these issues are a direct consequence of the lack of agreement on how to implement an ML workflow in practice, as discussed in the last blog post.

Additionally, a number of articles have structured the various offerings according to the completeness of the solution, i.e. how much of the ML workflow it supports, and the “kind” of offering. That is to say,

  1. Is it a platform or specialised/piecemeal tool?
  2. Is it ML specific or applicable to software engineering (SWE) in general?
  3. What stage(s) in the ML lifecycle does it apply to?

This last criterion is particularly subjective, as it is debatable which specific tasks a given tool relates to, e.g. data observability touches pretty much everything (TWIML, 2021). Two such frameworks are given by (thoughtworks, 2021) and (Miranda, 2021a). I prefer the latter, as it situates ML tooling within the wider SWE environment and provides the basis of a simple framework for understanding the current state of tooling maturity, even though it does not directly link tools to ML workflow stages. This framework will prove insightful when we consider how to appraise tooling choices below.
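To make the three criteria concrete, here is a minimal sketch in Python of how such a framework might be encoded. The category values and the example classifications are my own illustrative reading, not labels taken from any of the cited reviews.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    PLATFORM = "end-to-end platform"
    SPECIALIST = "specialised/piecemeal tool"


class Applicability(Enum):
    ML_SPECIFIC = "ML specific"
    GENERAL_SWE = "general software engineering"


@dataclass
class ToolProfile:
    """One tool's position along the three criteria above."""
    name: str
    scope: Scope                  # criterion 1: platform vs piecemeal
    applicability: Applicability  # criterion 2: ML specific vs general SWE
    lifecycle_stages: set         # criterion 3: stages touched (subjective!)


# Illustrative classifications: reasonable readings, not official labels.
tools = [
    ToolProfile("Kubeflow", Scope.PLATFORM, Applicability.ML_SPECIFIC,
                {"training", "deployment", "orchestration"}),
    ToolProfile("Airflow", Scope.SPECIALIST, Applicability.GENERAL_SWE,
                {"orchestration"}),
]

for t in tools:
    print(f"{t.name}: {t.scope.value}, {t.applicability.value}, "
          f"stages={sorted(t.lifecycle_stages)}")
```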

ML Tooling Adoption

We ended the previous post by examining ML maturity in industry, noting two distinct groups amongst enterprise companies: mature companies at the forefront of ML, and immature companies that have only launched ML/AI products in recent years. Furthermore, we examined the predominance of in-house tooling, at least within mature companies. In fact, in-house tooling is prominent in both mature and immature companies, but for different reasons.

In the case of mature companies, this is largely due to the absence of (mature) alternatives; after all, they are leading the field. This is especially true of MLOps tooling specifically: feature stores, model serving, and so on. In a review of MLOps-specific platform components amongst predominantly large US B2Cs, only the workflow orchestration component of the ML platform frequently used open-source solutions, typically Kubeflow Pipelines or Airflow (Chan, 2021). This may be because it is the most generic component considered (dotscience, 2019). Among such companies, the goals of developing in-house ML platforms are twofold (Symeonidis, 2022): first, to reduce the time required to build and deliver models, and second, to maintain the stability and reproducibility of predictions; in other words, to address the delivery and deployment phases of MLOps respectively. Might this picture change as the tooling ecosystem matures as a whole? Probably not all that much, for three main reasons. First, any extant in-house tooling has emerged in response to a specific strategic need of a given company, which it may not be desirable to make more widely available, and which, given the two desiderata above, could not easily be replicated externally. Secondly, and relatedly, it may not be possible to build a fully featured alternative externally. Thirdly, success at this level is determined not by any one technology but by its end-to-end integration, which introduces an additional barrier to third-party tools. As Matt Turck says: “Big Data [or ML/AI] success is not about implementing one piece of technology … but instead requires putting together an assembly line of technologies, people and processes.” (Turck, 2016).
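To give a flavour of what this orchestration component looks like in practice, below is a minimal, hypothetical Airflow DAG for a daily retraining job. This is a sketch assuming Airflow 2.4+ and its TaskFlow API; the task bodies are placeholders, not anyone's actual pipeline.

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2022, 1, 1), catchup=False)
def ml_retraining_pipeline():
    @task
    def extract_features() -> dict:
        # Placeholder: pull the training set from a warehouse or feature store.
        return {"rows": 10_000}

    @task
    def train_and_register(stats: dict) -> None:
        # Placeholder: fit a model and push it to a model registry.
        print(f"training on {stats['rows']} rows")

    # TaskFlow infers the dependency graph from these function calls.
    train_and_register(extract_features())


ml_retraining_pipeline()
```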

In the case of immature companies, the tendency towards in-house solutions is in large part precisely due to this immaturity: they may have neither the technical skills nor the budget to acquire third-party tools (dotscience, 2019). Relatedly, they may find it hard to justify the need for such tools precisely because there is no benchmark for comparison, especially given that these tools are used in large part to optimise a known process. As we will see below, there is an unavoidable tradeoff between building an ML workflow in a piecemeal way and selecting a platform that covers the workflow end-to-end; either approach necessarily introduces technical debt (Ruf, 2021). Either way, it may also be hard to find offerings that match a company's requirements, owing to the lack of consensus around ML workflow implementation (see the previous post) and the absence of a feature-complete ML platform (TWIML, 2021). Conversely, it may be too challenging to fit third-party options around existing ML pipeline implementations (discussed in detail below). A related concern: for many of these companies, available tooling may not be appropriate for their needs, notably due to compliance issues. For instance, the current crop of data labelling solutions, which typically use outsourced or contract workers, is unlikely to be appropriate in regulated fields like financial services (Vanian, 2021). For many enterprises, it is generally more desirable to retain full control and ownership of data and to avoid distributing data to multiple parties. Companies like Snorkel AI aim to directly address these concerns.
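To illustrate the programmatic labelling approach Snorkel popularised, here is a minimal sketch using the open-source snorkel library (on which Snorkel AI's commercial platform builds). The task, the heuristics, and the example data are invented for illustration; the point is that labelling stays in-house, with no data sent to contract workers.

```python
import pandas as pd
from snorkel.labeling import PandasLFApplier, labeling_function
from snorkel.labeling.model import LabelModel

ABSTAIN, NOT_COMPLAINT, COMPLAINT = -1, 0, 1


@labeling_function()
def lf_mentions_refund(x):
    # Heuristic: messages asking for refunds are likely complaints.
    return COMPLAINT if "refund" in x.text.lower() else ABSTAIN


@labeling_function()
def lf_says_thanks(x):
    return NOT_COMPLAINT if "thank" in x.text.lower() else ABSTAIN


df = pd.DataFrame({"text": [
    "I demand a refund for this order",
    "Thanks, the issue is resolved",
    "Where is my refund?",
]})

# Apply every labelling function to every row, then denoise the
# resulting noisy, conflicting votes with a generative label model.
applier = PandasLFApplier([lf_mentions_refund, lf_says_thanks])
L_train = applier.apply(df)

label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train)
df["weak_label"] = label_model.predict(L_train)
```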

Taken together, these points suggest that the diversity of the current tooling landscape is a direct consequence of the need to address numerous different implementations that may be highly company-specific. Beyond this, the general glut of VC and other available funding (Turck, 2021) has enabled the rapid expansion of companies and solutions in this space.

Decisions, decisions, decisions

Given the previous discussion, it is tempting to ask how to actually choose a suitable platform and/or set of tools for an immature ML implementation. As expected, there is no single right answer; however, intelligent choices can be made that allow meaningful progress in this area whilst minimising problems further down the line.

The fastest route to full MLOps adoption is to cover the ML lifecycle end-to-end first, or, to use the terminology of (Stiebellehner, 2022a), to focus first on depth of functionality and only then on breadth of features. This has the added advantage of reducing the time and cost of idea validation, so that meaningful feedback can be obtained sooner rather than later.

As has already been discussed, there is currently no widely accepted, complete end-to-end ML platform. This leaves choosing between the current crop of ML platforms and/or piecing together an ML workflow from standalone, specialist tools. From a strategic perspective, it may be undesirable to go all in with one platform provider: it may not fit your needs long term, and may lead to vendor lock-in, meaning any subsequent change carries a greater opportunity cost. Conversely, developing processes around standalone tools may mean creating more work internally than is desirable, especially if speed of delivery is the main goal. In any case, following the framework introduced above, the more mature a given tool or platform, and to a certain extent the more piecemeal the tool, the safer the choice (Miranda, 2021b).

Especially for businesses where ML is not a core function, the general advice is to buy rather than build, and to add custom integrations between the chosen components, whether these are platforms or specialised tools (Miranda, 2021b). Tool selection should be driven by immediate delivery goals, with a focus on the integration and overlap between components; the goal is to select the smallest satisfying set of options. As the focus is on depth over breadth, factors such as scalability or automation should not be the primary concerns. Questions to ask include: which languages does each tool support, to aid interoperability? Will a given selection of tools introduce redundancy through overlapping functionality? Questions like these can often only be answered with experience, so it is essential to iterate and learn from the process (Ruf, 2021). An illuminating example of tooling adoption is provided by Kubeflow: although it supports an end-to-end ML platform, only particular elements, such as Kubeflow Pipelines, have found widespread adoption.
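In practice, those custom integrations often take the form of a thin internal interface with one adapter per vendor, so that a component can be swapped later without rewriting the surrounding pipeline. Below is a minimal sketch of the idea; the ModelRegistry interface is something you would design yourself, and the MLflow-backed adapter is one illustrative vendor choice, not a recommendation.

```python
from abc import ABC, abstractmethod
from typing import Any


class ModelRegistry(ABC):
    """Internal contract the rest of the pipeline codes against."""

    @abstractmethod
    def register(self, name: str, model: Any, params: dict) -> str:
        """Persist a model version; return an identifier for it."""

    @abstractmethod
    def load(self, name: str, version: str) -> Any:
        """Fetch a previously registered model."""


class MlflowModelRegistry(ModelRegistry):
    """Adapter wrapping an MLflow tracking server."""

    def register(self, name: str, model: Any, params: dict) -> str:
        import mlflow
        import mlflow.sklearn

        with mlflow.start_run() as run:
            mlflow.log_params(params)
            # registered_model_name makes the model addressable via models:/ URIs.
            mlflow.sklearn.log_model(model, artifact_path=name,
                                     registered_model_name=name)
        return run.info.run_id

    def load(self, name: str, version: str) -> Any:
        import mlflow.sklearn

        return mlflow.sklearn.load_model(f"models:/{name}/{version}")
```

Swapping vendors then means writing one new adapter rather than touching every pipeline that consumes models.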

What’s on the Horizon?

One of the central debates around the current tooling constellation is whether and when we should expect some kind of consolidation. Most observers expect some rationalisation; however, the parameters are far from clear (Turck, 2021). Consolidation of a kind, “functional consolidation”, has already been observed for established and successful solutions like Databricks and Snowflake, where there is a trend towards fully featured, end-to-end offerings (Turck, 2021). This may well be part of the more general pattern whereby successful third-party solutions are components of a much larger ecosystem, e.g. Snowflake and AWS. Beyond this, we might expect additional service offerings from the large cloud platforms, either through internal innovation or through mergers and acquisitions.

So far, any wider rationalisation of tooling and processes has generally been offset by the increasing complexity demanded of data and tooling, and by the pace of that change. This is evident with regard to MLOps and DataOps tooling, both of which are presently typically handled in-house (Turck, 2020). A further brake on progress is the widely observed shortage of talent, which is expected to persist longer term (O’Shaughnessy, 2022), although to a certain extent this may be mitigated with additional tooling.

Data quality testing and observability tools are currently seeing the greatest success in the market, precisely because of where the market is in terms of overall ML maturity (Stiebellehner, 2022b). However, we appear to be entering a new phase of ML tooling and of ML/AI more generally. Given the notable growth in startups dedicated to MLOps, it can be expected that operationalisation will come to dominate in the near term (Shankar, 2021). However, with a general cooling of AI hiring (Huyen, 2020) and a pull-back in VC funding (Turck, 2022), a consolidation may be just around the corner. As for technological developments, I will leave the discussion for the third blog post in this series; however, I expect data-centric AI, MLOps, and AutoML to be the key trends in ML tooling in the near future.
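Before moving on, a rough illustration of what the data quality and observability wave automates: the sketch below hand-rolls the kind of checks such a product runs on every incoming batch. The column names and thresholds are invented, and real products add profiling, alerting, and lineage on top.

```python
import pandas as pd


def check_batch(df: pd.DataFrame) -> list:
    """Return a list of data quality violations for one feature batch."""
    failures = []
    if df["user_id"].isna().any():
        failures.append("user_id contains nulls")
    if not df["age"].between(0, 120).all():
        failures.append("age outside expected range [0, 120]")
    if df["country"].nunique() > 200:
        # A cardinality spike often signals upstream schema or encoding drift.
        failures.append("country cardinality spiked")
    return failures


batch = pd.DataFrame({"user_id": [1, 2], "age": [34, 51], "country": ["DE", "US"]})
assert check_batch(batch) == []
```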

Wrap-up

In this blog post, we continued from our previous discussion of ML maturity in industry, showing the link between the generally low level of sophisticated ML adoption and both the number and the incompleteness of ML tooling offerings. Furthermore, we hinted at some of the key trends for the near future: continued interest in data testing and observability, and increasing funding for MLOps startups. Following on from this, the next blog post in this series will examine some of the key technologies emerging in the tooling landscape.

References

Felipe, A., & Maya, V. (2021). The State of MLOps.

Hewage, N. et al. (2022). Machine Learning Operations: A Survey on MLOps Tool Support.

Huyen, C. (2020, December 30). Machine Learning Tools Landscape v2 (+84 new tools). Chip Huyen. Retrieved May 7, 2022, from https://huyenchip.com/2020/12/30/mlops-v2.html

LF AI & Data Foundation. (2022). LF AI & Data Foundation Interactive Landscape. LF AI & Data Landscape. Retrieved May 6, 2022, from https://landscape.lfai.foundation/

Miranda, L. (2021a, May 15). Navigating the MLOps tooling landscape (Part 2: The Ecosystem). Lj Miranda. Retrieved May 6, 2022, from https://ljvmiranda921.github.io/notebook/2021/05/15/navigating-the-mlops-landscape-part-2/

Miranda, L. (2021b, May 30). Navigating the MLOps tooling landscape (Part 3: The Strategies). Lj Miranda. Retrieved May 6, 2022, from https://ljvmiranda921.github.io/notebook/2021/05/30/navigating-the-mlops-landscape-part-3/

O’Shaughnessy, P. (2022, April 12). Alexandr Wang — A Primer on AI. Spotify. Retrieved May 8, 2022, from https://open.spotify.com/episode/0jFd4L8nvDROu05lk2kv6y?si=06e4af52baff44be&nd=1

Ruf, P. et al. (2021). Demystifying MLOps and Presenting a Recipe for the Selection of Open-Source Tools.

Shankar, S. (2021, December 13). The Modern ML Monitoring Mess: Categorizing Post-Deployment Issues (2/4). Shreya Shankar. Retrieved May 8, 2022, from https://www.shreya-shankar.com/rethinking-ml-monitoring-2/

Stiebellehner, S. (2022a, February 27). [The MLOps Engineer] Vertical first, horizontal second. Why you should break through to production early when developing machine learning systems and how MLOps facilitates this. Medium. Retrieved May 7, 2022, from https://sistel.medium.com/the-mlops-engineer-vertical-first-horizontal-second-306fa7b7a80b

Stiebellehner, S. (2022b, April 10). [The MLOps Engineer] The “Datadogs” of tomorrow. How the Data Quality, Monitoring & Observability wave is building up. ITNEXT. Retrieved May 9, 2022, from https://itnext.io/the-mlops-engineer-the-datadogs-of-tomorrow-614a88a374e0

thoughtworks. (2021). Guide to Evaluating MLOps Platforms November 2021.

Turck, M. (2016, February 1). Is Big Data Still a Thing? (The 2016 Big Data Landscape). Matt Turck. Retrieved May 4, 2022, from https://mattturck.com/big-data-landscape/

Turck, M. (2017, April 5). Firing on All Cylinders: The 2017 Big Data Landscape. Matt Turck. Retrieved May 4, 2022, from https://mattturck.com/bigdata2017/

Turck, M. (2020, October 21). The 2020 data and AI landscape. VentureBeat. Retrieved May 6, 2022, from https://venturebeat.com/2020/10/21/the-2020-data-and-ai-landscape/

Turck, M. (2021, September 28). Red Hot: The 2021 Machine Learning, AI and Data (MAD) Landscape. Matt Turck. Retrieved May 4, 2022, from https://mattturck.com/data2021/

Turck, M. (2022, April 28). The great VC pullback of 2022 — Matt Turck. Matt Turck. Retrieved May 8, 2022, from https://mattturck.com/vcpullback/

TWIML. (2021, June 16). Introducing TWIML’s New ML and AI Solutions Guide. TWIML. Retrieved May 6, 2022, from https://twimlai.com/solutions/introducing-twiml-ml-ai-solutions-guide/


Data scientist and machine learning engineer, fascinated by DS/ML business applications. Check out my newsletter: https://tpgmartin.substack.com/