Notes from Industry
Many problems in modern global companies, such as H&M, Ikea, GAP, and McDonald’s, are well suited to Machine Learning (ML) solutions. Even a single-digit percentage improvement in supply chain, marketing, or sales can translate into very large cost savings or revenue uplifts. ML-based product development in a global company is complex, and it often needs to be repeated many times to optimize tens to hundreds of business processes spread across many departments and influencing millions of customers all over the world. While it may be possible to roll out five use cases with ad hoc processes, it is almost impossible to roll out tens or hundreds of times as many without a systematic process in place.

At a high level, ML-based product development has four different stages (see Figure 1). First, a team is bootstrapping when it explores the problem and the solution approaches. Second, the team is prototyping when it implements a solution to test the feasibility of the approach. Third, the team is industrializing when it optimizes the solution for systematic delivery in a simple setting. Finally, the team is scaling when it improves the solution for delivery in a more complex, large-scale setting. While the core design of the solution in one stage transfers easily to the next, the implementation details may change drastically. Therefore, the execution strategy and the team implementing it should change between stages, except where deep domain knowledge must stay ingrained in the team. That should be true of the stakeholders, but need not be true of the technical team.
While it may be tempting to go straight from bootstrapping to industrializing, it is a bad idea to skip prototyping. Prototyping can show that an idea is infeasible for different reasons, such as a lack of sufficient data or the lack of a good validation mechanism, which helps an organization prioritize what to industrialize. Furthermore, it identifies the building blocks of the core solution, which helps prioritize optimization efforts during industrialization. Therefore, it is important that the organization focuses on how to prototype systematically, which relies on a few easy-to-follow methods supported by technical solutions that can be repeated day in and day out. In this article, we present a systematic approach to prototyping machine learning solutions.
An Approach To Machine Learning Prototypes
Design Documents
The design documents are meant to help the team to plan, execute, and communicate with the stakeholders effectively. A team should start with two simple design documents:
- One-pager
- System diagram
A one-pager includes information about the formulation of the business problem and evaluation criteria of the solution. A one-pager includes the following:
- Business case definition
- Stakeholders and resources
- Performance metrics and validation criteria
- Solution optimization metrics and validation criteria
- Prototype deadline and milestones
A system diagram contains the key elements of technical development and helps the team plan the development efforts efficiently. A system diagram includes the following:
- Key data tables
- Pipelines showing key algorithmic steps and input and output artifacts
- Key infrastructure components
In the beginning, it is fine to leave unknown components blank; they should be filled in as the team makes progress. We recommend using the documents in planning discussions and demos.
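One possible way to keep a one-pager honest about what is still unknown is to hold it as a small structured record. The sketch below is only an illustration: the class `OnePager`, its field names (which mirror the one-pager sections listed above), and the example business case are all hypothetical, not a prescribed template.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class OnePager:
    """Hypothetical one-pager record; None marks a section that is still unknown."""

    business_case: Optional[str] = None
    stakeholders_and_resources: Optional[str] = None
    performance_metrics: Optional[str] = None
    optimization_metrics: Optional[str] = None
    deadline_and_milestones: Optional[str] = None

    def open_items(self) -> List[str]:
        """Return the names of the sections that have not been filled in yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Illustrative draft: only the business case is known at the start.
draft = OnePager(business_case="Reduce overstock in seasonal assortments")
print(draft.open_items())
```

Listing the open items at the start of each planning discussion or demo is one way to make the remaining gaps visible to the stakeholders.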
Agile Ways of Working
Adopting agile principles is a must. How the agile routines are performed matters less, as long as the principles are upheld through some sort of formal or informal routine.
In practice, that means stakeholders must be involved from the very beginning and should be consulted frequently to guide the development in the right direction. Since the nature of the work is exploratory, it is good to keep the requirements somewhat flexible and to adopt a kanban-style workflow rather than a scrum one.
Pipeline Development

As shown in Figure 2, it is important to think about the prototype solution in the form of three separate but dependent pipelines: feature generation, model training, and model inference. The feature generation pipeline takes the input data tables and generates a feature table of a specific size using the relevant data boundaries, e.g., dates or geographic locations. The same pipeline should be used to generate features for both model training and inference. The model training pipeline executes steps on training features to generate a model and its performance dashboard. The model inference pipeline executes steps on inference features to generate forecasted targets and their performance dashboard.
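The three-pipeline split can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation behind Figure 2: the function names, the toy ratio "model", the integer `day` field standing in for a date boundary, and the sample rows are all hypothetical.

```python
from statistics import mean


def generate_features(rows, start, end):
    """Feature generation: keep rows inside the [start, end) boundary."""
    return [
        {"day": r["day"], "x": r["units"], "y": r.get("target")}
        for r in rows
        if start <= r["day"] < end
    ]


def train_model(features):
    """Model training: fit a toy ratio model and return it with a metrics dict."""
    ratio = mean(f["y"] / f["x"] for f in features)
    mae = mean(abs(f["y"] - ratio * f["x"]) for f in features)
    return {"ratio": ratio}, {"train_mae": mae}


def run_inference(model, features):
    """Model inference: produce one forecast per inference feature row."""
    return [{"day": f["day"], "forecast": model["ratio"] * f["x"]} for f in features]


# Illustrative input table; the last row has no target yet.
rows = [
    {"day": 1, "units": 10, "target": 20},
    {"day": 2, "units": 20, "target": 40},
    {"day": 3, "units": 30, "target": None},
]

# The same feature pipeline serves both training and inference.
train_features = generate_features(rows, start=1, end=3)
model, dashboard = train_model(train_features)
forecasts = run_inference(model, generate_features(rows, start=3, end=4))
```

Because `generate_features` is shared, a change to the feature logic reaches training and inference together, while the training step can be iterated on without touching the other two.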
While it is tempting to implement one notebook to rule them all, it is a bad idea. During solution exploration, certain steps often need more iterations than others. For example, finding a good training algorithm may take more iterations than feature generation. Keeping the pipelines separate for different types of activities makes it easy and efficient to iterate.
Each pipeline should be designed as a directed acyclic graph (DAG) of tasks if executing its steps takes a long time and it is beneficial to restart the DAG from a certain task.
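Restarting a DAG from a given task can be sketched with the standard-library `graphlib` module (Python 3.9+). The task names, the `DEPENDENCIES` mapping, and the helper functions below are hypothetical, and the sketch assumes that the outputs of skipped upstream tasks were saved by an earlier run.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "load": set(),
    "clean": {"load"},
    "features": {"clean"},
    "train": {"features"},
    "report": {"train"},
}


def downstream_of(start, deps):
    """Return `start` plus every task that depends on it, directly or indirectly."""
    selected = {start}
    changed = True
    while changed:
        changed = False
        for task, parents in deps.items():
            if task not in selected and parents & selected:
                selected.add(task)
                changed = True
    return selected


def run(tasks, deps, restart_from=None):
    """Run tasks in dependency order, optionally only from `restart_from` onward."""
    order = list(TopologicalSorter(deps).static_order())
    to_run = downstream_of(restart_from, deps) if restart_from else set(order)
    executed = []
    for name in order:
        if name in to_run:
            tasks[name]()  # assumes upstream outputs were persisted earlier
            executed.append(name)
    return executed


# Placeholder callables stand in for the real task bodies.
TASKS = {name: (lambda: None) for name in DEPENDENCIES}
executed = run(TASKS, DEPENDENCIES, restart_from="features")
print(executed)
```

A workflow orchestrator would normally provide this behavior; the point of the sketch is only that declaring dependencies explicitly is what makes a partial rerun possible.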
Process Improvement
Choose metrics to improve the prototyping process. Two metrics that we recommend adopting are the delay in stakeholder acceptance and the handover duration. Focusing on the former should drive the development of reusable components, such as standardized performance metrics, libraries for the pipelines, and hypothesis testing mechanisms. Focusing on the latter should drive the adoption of best practices, such as code/system testing and validation, documentation, and archiving.
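As a rough illustration, both metrics can be tracked from a handful of dates per prototype. The definitions used here are assumptions rather than something fixed by this article: acceptance delay is taken as the number of days from demo to stakeholder acceptance, and handover duration as the number of days from the start to the end of the handover. The records are made up for the example.

```python
from datetime import date
from statistics import median

# Hypothetical prototype records; all dates are illustrative only.
prototypes = [
    {
        "name": "A",
        "demo": date(2021, 3, 1),
        "accepted": date(2021, 3, 15),
        "handover_start": date(2021, 3, 16),
        "handover_end": date(2021, 4, 6),
    },
    {
        "name": "B",
        "demo": date(2021, 5, 3),
        "accepted": date(2021, 5, 10),
        "handover_start": date(2021, 5, 11),
        "handover_end": date(2021, 5, 25),
    },
]


def days_between(start, end):
    """Number of calendar days from start to end."""
    return (end - start).days


# Assumed definitions: demo -> acceptance, and handover start -> handover end.
acceptance_delays = [days_between(p["demo"], p["accepted"]) for p in prototypes]
handover_durations = [
    days_between(p["handover_start"], p["handover_end"]) for p in prototypes
]

print("median acceptance delay (days):", median(acceptance_delays))
print("median handover duration (days):", median(handover_durations))
```

Reviewing these numbers after every few prototypes gives the team a simple signal of whether its reusable components and practices are actually shortening the cycle.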
Archive The Journey
Archive the code and the input/output data used for testing and validation in every project. Furthermore, for each project, maintain a journal of major learnings, i.e., achievements, pitfalls, and mistakes. The code (and data) will help the team reuse some of the solutions with minor or no modifications. The journal will help with future work planning and prioritization.
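A lightweight version of this habit can be sketched with the standard library alone. The helper names `archive_manifest` and `append_journal`, the JSON Lines journal format, and the sample file and note are all assumptions made for illustration; a team may well prefer a data versioning tool or a shared wiki instead.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def archive_manifest(paths):
    """Map each file path to the SHA-256 digest of its contents."""
    return {str(p): hashlib.sha256(Path(p).read_bytes()).hexdigest() for p in paths}


def append_journal(journal_path, kind, note):
    """Append one learning (achievement, pitfall, or mistake) as a JSON line."""
    if kind not in {"achievement", "pitfall", "mistake"}:
        raise ValueError(f"unknown journal entry kind: {kind}")
    with open(journal_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({"kind": kind, "note": note}) + "\n")


# Usage with a throwaway directory and a made-up input file.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "input.csv"
    data.write_text("day,units\n1,10\n", encoding="utf-8")
    manifest = archive_manifest([data])

    journal = Path(tmp) / "journal.jsonl"
    append_journal(journal, "pitfall", "Validation dates overlapped with training dates.")
    entries = [
        json.loads(line)
        for line in journal.read_text(encoding="utf-8").splitlines()
    ]
```

The manifest lets a later reader confirm that archived inputs and outputs are the ones a result was produced from, and the journal stays easy to search because every entry has a kind.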
Remark
I have a few words of caution.
First, it may be tempting to do a few nice things, e.g., maintaining high-quality code. However, we recommend keeping an eye on the ball and staying focused on the team’s strength, i.e., testing an idea. Optimizing early is a mistake and is not the responsibility of the prototyping team.
Second, one needs to learn how to walk before running. I recommend that teams do things as they see fit the first three or four times. Patterns should start emerging from those experiences, which should inspire reusable designs.
Finally, expect deviations, as every team’s journey is different. The approach shared in this article, or anywhere else, is meant to inspire a specific way of thinking. The reality of implementation, however, will be unique to each team.
Most importantly, keep prototyping.