The AutoML Dilemma

An Infrastructure Engineer’s Perspective

Haifeng Jin
Towards Data Science
8 min read · Sep 16, 2023

AutoML has been a hot topic for the past few years. The hype grew so high that some even expected it to replace human machine learning experts. However, after a long stretch without much adoption, expectations for AutoML are now dropping quickly, closely following Gartner's hype cycle.

AutoML on Gartner’s curve (Image by the author)

At this point, we need to understand the current status of AutoML and figure out the path forward. I am a software engineer who developed two AutoML libraries, AutoKeras and KerasTuner. In this article, I will review what AutoML is and the missing pieces that have prevented it from reaching mass adoption.

What is AutoML?

Imagine someone with limited machine learning expertise facing a real-world image classification problem. They can clearly define the problem and have the training data available. In this case, AutoML can help them build a trained machine learning model.

From an input and output perspective, AutoML does the following.

AutoML from an input and output perspective (Image by the author)

It takes in the problem definition and training data and outputs a trained machine learning model ready for deployment. For example, if given an image classification task, it takes in the training image dataset as input and outputs a trained image classification model.
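
As a minimal sketch with AutoKeras, one of the libraries mentioned above, the whole process fits in a few lines. The trial and epoch budgets here are arbitrary illustration values, and MNIST stands in for your own dataset:

```python
import autokeras as ak
from keras.datasets import mnist

# The input: a clearly defined task (image classification) with
# labeled training data. MNIST is used as a stand-in dataset.
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# AutoKeras tries several model configurations and keeps the best one.
clf = ak.ImageClassifier(max_trials=3)  # small search budget for illustration
clf.fit(x_train, y_train, epochs=5)

# The output: a trained model, ready for deployment.
model = clf.export_model()
predictions = clf.predict(x_test)
```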

The steps AutoML tries to automate may include data preprocessing, featurization, model selection, hyperparameter tuning, neural architecture search, model training, inference on testing data, and data post-processing.

In summary, automated machine learning (AutoML) tries to bridge the gap between the many powerful machine learning models and training techniques available today and the real-world problems they could solve, by providing automated end-to-end solutions.

How does AutoML work?

For a given task and dataset, the AutoML system would efficiently try out a series of relevant methods or models and pick the best one for you.

You can think of it as a for loop containing the following steps:

  • Generate a model configuration.
  • Create and train the model with the configuration.
  • Evaluate the model on validation data.
  • Learn from the evaluation results to improve the configuration.

A smart agent in the AutoML system generates the configurations and improves over time by learning from the evaluation results.
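
In code, the loop looks roughly like this. This is only a sketch: agent, build_and_train, and evaluate are hypothetical placeholders for the system's internals, not a real library API.

```python
# Sketch only: `agent`, `build_and_train`, and `evaluate` are
# hypothetical placeholders, not an actual AutoML library API.
best_config, best_score = None, float("-inf")
for trial in range(max_trials):
    config = agent.generate_configuration()         # step 1
    model = build_and_train(config, training_data)  # step 2
    score = evaluate(model, validation_data)        # step 3
    agent.learn(config, score)                      # step 4: improve future proposals
    if score > best_score:
        best_config, best_score = config, score
```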

Many algorithms could be used as the smart agent, for example, Bayesian optimization or reinforcement learning. However, at its core, what the smart agent does is function approximation and function maximization. Let's look at them one by one.

  • Function approximation. The smart agent tries to learn the relation between model configurations and model performance. In mathematical terms, it tries to learn a function y = f(x), where x is the model configuration and y is the model's performance.
  • Function maximization. The end goal of the smart agent is to find the model configuration with the best performance. In other words, we want to find the x that maximizes f(x), i.e., argmax_x f(x).
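
KerasTuner makes this concrete: its BayesianOptimization tuner plays the smart agent, approximating y = f(x) from finished trials and proposing the next configuration x to try. The search space below is a small arbitrary example:

```python
import keras
import keras_tuner

def build_model(hp):
    # x, the model configuration, is sampled from this search space.
    model = keras.Sequential([
        keras.layers.Dense(hp.Int("units", 32, 512, step=32), activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Float("lr", 1e-4, 1e-2, sampling="log")),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# The tuner learns y = f(x) from past trials and proposes
# configurations that are likely to maximize validation accuracy.
tuner = keras_tuner.BayesianOptimization(
    build_model, objective="val_accuracy", max_trials=10)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val), epochs=5)
```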

The impact of AutoML

As you can imagine, the impact of AutoML is huge if widely adopted. It can dramatically increase the productivity of machine learning practitioners. They no longer need to spend a lot of time fine-tuning the details of the model configurations. They may only need to carefully define the task and manually constrain the search space to get the result faster.

What can AutoML do today?

The applications of AutoML today are quite limited, mainly focusing on the following two aspects.

  • Quick tryouts. Some machine learning engineers may want to quickly try machine learning on their tasks and datasets. They can use AutoML as a starting point and, if it achieves reasonably good results, further develop the ML solution by hand.
  • ML education. Students who are just starting to learn ML may use AutoML to understand what ML can do. They do not need to touch all the details of an ML solution but still get a quick overview of the process.

What can AutoML do in the future?

The expectations for what AutoML can do in the future are much higher than what it can do today. We summarize them into three main goals as follows.

  • For ML experts: Boost the productivity of data scientists and machine learning engineers.
  • For domain experts: Let domain experts, like medical doctors or mechanical engineers, easily apply ML to their problems.
  • For production engineers: Make the discovered solution easy to deploy to production.

The problems of AutoML

We have seen where we are now and where we are going with AutoML. The question is how we get there. We can summarize the problems we face today into three categories. When these problems are solved, AutoML will reach mass adoption.

Problem 1: Lack of business incentives

Modeling is a small part of developing a usable machine learning solution, which may also include data collection, cleaning, verification, model deployment, and monitoring. For any company that can afford to hire people for all these steps, the overhead of hiring machine learning experts to do the modeling is trivial. When they can build a team of experts without much extra cost, they do not bother experimenting with new techniques like AutoML.

So, people will only start to use AutoML when the costs of all the other steps have been driven down, because that is when the cost of hiring people for modeling becomes significant. Now, let's look at the roadmap towards this.

Many steps can be automated. We should be optimistic that, as cloud services evolve, many steps in developing a machine learning solution will be automated, like data verification, monitoring, and serving. However, one crucial step can never be fully automated: data labeling. Unless machines can teach themselves, humans will always need to prepare the data for machines to learn from.

Data labeling may therefore end up being the main cost of developing an ML solution. If we can reduce the cost of data labeling, companies will have a business incentive to use AutoML to remove the modeling cost, which would by then be the only significant cost left.

The long-term solution: Unfortunately, the ultimate solution for reducing the cost of data labeling does not exist today. We will have to rely on future research breakthroughs in "learning with small data". One possible path is to invest in transfer learning.

However, few researchers are interested in working on transfer learning because it is hard to publish on the topic. For more details, you can watch this video: Why most machine learning research is useless.

The short-term solution: In the short term, we can fine-tune pretrained large models with small data, which is a simple form of transfer learning and of learning with small data.
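
A minimal Keras sketch of what this looks like: freeze a backbone pretrained on a large dataset and train only a small head on your small labeled dataset. The class count and dataset variables are placeholders:

```python
import keras

# A backbone pretrained on ImageNet carries the transferred knowledge.
base = keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg")
base.trainable = False  # freeze it; only the head below gets trained

# A small task-specific head, trainable with little labeled data.
model = keras.Sequential([base, keras.layers.Dense(5, activation="softmax")])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(small_x, small_y, epochs=5)  # placeholder small dataset
```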

In summary, once cloud services automate most of the steps in developing an ML solution, and AutoML can use pretrained models to learn from smaller datasets and reduce the data labeling cost, companies will have a business incentive to apply AutoML to cut their ML modeling costs.

Problem 2: Lack of maintainability

Deep learning models are not reliable. Their behavior is sometimes unpredictable, and it is hard to understand why a model gives a specific output.

Engineers maintain the models. Today, we need an engineer to diagnose and fix a model when problems occur, and the company goes through engineers for anything it wants to change about the deep learning model.

An AutoML system is much harder to interact with than an engineer. Today, you can only use it as a one-shot method: you give the AutoML system a series of objectives, clearly defined in math in advance, and it creates the deep learning model. If you encounter any problem using the model in practice, it will not help you fix it.

The long-term solution: We need more research in HCI (human-computer interaction). We need a more intuitive way to define the objectives, so that the models created by AutoML are more reliable, and better ways to interact with the AutoML system to update the model to meet new requirements or fix problems, without spending too many resources searching over all the different models again.

The short-term solution: Support more objective types, like FLOPs and the number of parameters to limit model size and inference time, or a weighted confusion matrix to deal with imbalanced data. When a problem occurs with the model, people can add a relevant objective to the AutoML system and let it generate a new model.
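
For instance, a parameter budget can be folded directly into the search objective. A minimal sketch, assuming a hypothetical scoring hook in the AutoML system:

```python
def score(model, val_accuracy, max_params=5_000_000, penalty=0.1):
    # Hypothetical scoring hook: penalize configurations that exceed
    # the parameter budget so the search trades accuracy against size.
    overshoot = max(0.0, model.count_params() - max_params) / max_params
    return val_accuracy - penalty * overshoot
```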

Problem 3: Lack of infrastructure support

When developing an AutoML system, we found that some features we need from deep learning frameworks just do not exist today. Without these features, the power of an AutoML system is limited. They are summarized as follows.

First, state-of-the-art models with flexible unified APIs. To build an effective AutoML system, we need a large pool of state-of-the-art models to assemble the final solution. The model pool needs to be updated regularly and well-maintained. Moreover, the APIs to call the models need to be highly flexible and unified so we can call them programmatically from the AutoML system. They are used as building blocks to construct an end-to-end ML solution.

To solve this problem, we developed KerasCV and KerasNLP, domain-specific libraries for computer vision and natural language processing tasks built upon Keras. They wrap the state-of-the-art models into simple, clean, yet flexible APIs, which meet the requirements of an AutoML system.
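
For example, in KerasNLP, state-of-the-art models sit behind one from_preset interface, preprocessing included, so an AutoML system can swap models by changing a string:

```python
import keras_nlp

# One unified API: change the preset string to get a different
# state-of-the-art model; tokenization and preprocessing are built in.
classifier = keras_nlp.models.BertClassifier.from_preset(
    "bert_base_en_uncased", num_classes=2)
classifier.fit(
    x=["this movie was great", "this movie was terrible"],
    y=[1, 0],
    batch_size=2,
)
```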

Second, automatic hardware placement of the models. The AutoML system may need to build and train large models distributed across multiple GPUs on multiple machines. An AutoML system should be runnable on any given amount of computing resources, which requires it to dynamically decide how to distribute the model (model parallelism) or the training data (data parallelism) for the given hardware.

Surprisingly and unfortunately, none of the deep learning frameworks today can automatically distribute a model across multiple GPUs. You have to explicitly specify the GPU allocation for each tensor. When the hardware environment changes, for example, when the number of GPUs is reduced, your model code may no longer work.
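
To see what that explicit placement means, here is a TensorFlow sketch of manual model parallelism; the shapes and device names are arbitrary:

```python
import tensorflow as tf

# Manual model parallelism: each weight is pinned to a GPU by hand.
with tf.device("/GPU:0"):
    w1 = tf.Variable(tf.random.normal([4096, 4096]))
with tf.device("/GPU:1"):
    w2 = tf.Variable(tf.random.normal([4096, 4096]))

def forward(x):
    with tf.device("/GPU:0"):
        h = tf.nn.relu(tf.matmul(x, w1))
    with tf.device("/GPU:1"):
        return tf.matmul(h, w2)

# If the machine has fewer GPUs, this placement has to be rewritten by
# hand; nothing here adapts to the hardware automatically.
```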

I do not see a clear solution to this problem yet. We will have to allow some time for the deep learning frameworks to evolve. Some day, the model definition code will be independent of the code for tensor hardware placement.

Third, ease of deployment for the models. Any model produced by the AutoML system may need to be deployed downstream to cloud services, end devices, and so on. If you still need to hire an engineer to reimplement the model for specific hardware before deployment, which is most likely the case today, why not have the same engineer implement the model in the first place instead of using an AutoML system?

People are working on this deployment problem today. For example, Modular created a unified representation for models and integrated the major hardware providers and deep learning frameworks around it. When a model implemented in a deep learning framework is exported to this format, it becomes deployable to any hardware that supports it.

Conclusions

Despite all the problems we discussed, I am still confident in AutoML in the long run. I believe these problems will be solved eventually, because automation and efficiency are the future of deep learning development. Though AutoML has not been massively adopted yet, it will be, as long as the ML revolution continues.
