Getting Started

Machine Learning: A Systems Engineering Perspective

This article takes a holistic approach to machine learning using elementary systems engineering principles, enabling you to understand and manage the fundamental building blocks of most machine learning systems. You will learn how to get from data to predictions.

Daniel Patzer
Towards Data Science
5 min read · Jan 10, 2021

--

Figure 1: Machine Learning Systems Design

Table of Contents

  1. Systems Engineering principles
  2. Machine Learning
  3. Exploratory Data Analysis
  4. Training Subsystem
  5. Evaluation
  6. Example

Systems Engineering principles

Systems engineering seeks to understand the big picture by breaking complex projects into manageable, well-defined subsystems. This article leverages fundamental systems engineering principles to introduce machine learning as a system composed of interacting elements. The terminology used throughout this article builds on the fundamental idea that a system is a purposeful whole consisting of interacting parts. Each element of this system is atomic (i.e., not further decomposable) and is modeled using descriptive features. This reduces complexity and supports task independence by giving management a high-level perspective of a machine learning pipeline.

Machine Learning

In machine learning, everything originates from a well-approximated representation of the real-world environment through data. The right data is therefore critical to the success of your applied machine learning model. Even though the term “right” is somewhat vague and completely task-dependent, some important considerations need to be made before project initiation. First, do I own, or can I license, the data? The more people have access to the same data stream, the more competition there is. I would only apply this strategy to commercial use cases that are subject to strict data protection laws. If the data is not personal and restrictions do not apply, I recommend open-sourcing your data so that everybody can contribute. Second, consider whether you will have access to a continuous stream of data. This enables continual improvements in model accuracy and generalization performance by retraining the model on the latest data samples. The entire process of gathering the right data is called data mining.

The follow-up step is data preprocessing. It is crucial for enhancing data quality and extracting meaningful insights, and therefore for creating high-quality machine learning systems. The process includes the removal of duplicates, normalization and transformation of data points, and feature extraction and selection. By then the data is usually organized in a tabular, structured database. In our example, the input features, which can be the pixels of an image or speech frequencies, are organized with a corresponding label, or annotation. This structure describes a standard supervised learning scenario, which we will use for illustration purposes in this article. More generally, machine learning comprises three types of algorithms: Supervised Learning, Unsupervised Learning, and Reinforcement Learning.

Figure 2: Subclasses within Machine Learning
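To make the preprocessing step more concrete, here is a minimal sketch in Python. It assumes a hypothetical tabular dataset with a label column; the file name and column names are placeholders rather than part of any real dataset discussed later.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical tabular dataset: feature columns plus a "label" annotation column.
df = pd.read_csv("dataset.csv")

# Remove exact duplicate rows.
df = df.drop_duplicates()

# Separate the input features from the label (annotation).
features = df.drop(columns=["label"])
labels = df["label"]

# Normalize the numeric features to zero mean and unit variance.
scaler = StandardScaler()
features_scaled = scaler.fit_transform(features.select_dtypes("number"))
```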

Exploratory Data Analysis

Before starting development, you might only have a vague strategy or goal in mind for what your machine learning model should learn from the data. To foster your ideas, Exploratory Data Analysis (EDA) can act as an exploratory stage for gaining rudimentary interpretations of the data. Summarizing the main characteristics by plotting them visually helps you understand the dataset. This is mostly done by reducing the dimensionality of the data to extract the main features and trends. Furthermore, EDA is an essential tool for exploratory research and for outlining the scope, allowing project managers to identify system requirements.
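As a rough sketch, and assuming the same hypothetical tabular dataset as in the preprocessing example, an EDA pass could summarize each feature and project the data onto two principal components for a quick visual overview:

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Same hypothetical dataset as in the preprocessing sketch.
df = pd.read_csv("dataset.csv").drop_duplicates()
X = df.drop(columns=["label"]).select_dtypes("number")
y = pd.factorize(df["label"])[0]

# Summarize the main characteristics of every numeric feature.
print(X.describe())

# Reduce the dimensionality to two components to visualize trends and clusters.
X_2d = PCA(n_components=2).fit_transform(X)
plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y, s=5)
plt.xlabel("principal component 1")
plt.ylabel("principal component 2")
plt.title("Data projected onto its first two principal components")
plt.show()
```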

Training Subsystem

The training subsystem can be divided into three stages: choosing a learning algorithm, optimizing the hyperparameters, and training the model on the data. At the first stage, choosing a learning algorithm, there are traditional machine learning algorithms that have been around for decades, such as Random Forests (RF) or Support Vector Machines (SVM). However, this series of posts will mainly focus on algorithms based on neural architectures. The two types of algorithms I find most promising are Deep Learning (DL) and Deep Reinforcement Learning (DRL), since both scale exceptionally well with GPU computation and the amount of data.
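At this stage, the choice of algorithm is largely a configuration decision. Below is a minimal sketch using scikit-learn purely for illustration; the candidate set and hyperparameter values are arbitrary placeholders.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Candidate learning algorithms: two traditional methods and a small
# neural architecture as a stand-in for the deep learning family.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100),
    "svm": SVC(kernel="rbf"),
    "neural_net": MLPClassifier(hidden_layer_sizes=(128, 64)),
}

# In a real pipeline every candidate would be trained and compared on the
# same validation split; here we only pick one to pass downstream.
model = candidates["neural_net"]
```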

After choosing the learning algorithm and model architecture, the second step is hyperparameter optimization. Hyperparameters are parameters that are not directly learned by the model. They are used to configure either the model (e.g., the number of decision trees and their depth, or the number of layers of a deep neural network) or the cost function (e.g., the learning rate for the gradient descent algorithm). During training, the algorithm iterates over the dataset in multiple batches. An epoch is completed once the model has been exposed to every data point in the dataset exactly once and has updated its internal parameters (e.g., weights) accordingly.
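The sketch below shows where these hyperparameters enter a training loop. It uses PyTorch with randomly generated tensors as stand-ins for the preprocessed features and labels; the architecture and values are arbitrary assumptions, not a prescribed configuration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters: configured up front, not learned from the data.
learning_rate = 1e-3   # cost-function side: step size for gradient descent
batch_size = 32        # how many samples each iteration (batch) sees
num_epochs = 10        # how many full passes over the dataset
hidden_units = 64      # model side: width of the hidden layer

# Random stand-ins for the preprocessed feature matrix and labels.
X = torch.randn(1000, 20)
y = torch.randint(0, 2, (1000,))
loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)

model = nn.Sequential(nn.Linear(20, hidden_units), nn.ReLU(), nn.Linear(hidden_units, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(num_epochs):       # one epoch = every data point seen once
    for batch_x, batch_y in loader:   # the algorithm iterates in batches
        optimizer.zero_grad()
        loss = loss_fn(model(batch_x), batch_y)
        loss.backward()               # gradients of the cost function
        optimizer.step()              # update internal parameters (weights)
```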

Evaluation

This iteration continues for multiple epochs until the model either reaches a certain performance score or the target number of epochs. While training a model is a key step, measuring the model's generalization performance on unseen data is an equally important aspect that should be considered in every machine learning pipeline. Generalization is the ability to perform well on unobserved samples (new data), and the main challenge in machine learning is to reduce the generalization error. It can be approximated using hold-out validation or cross-validation: in the simplest setup, 80% of the dataset is used exclusively for training, while the remaining 20% is reserved for validation and testing. The model's performance scores are calculated after each epoch.
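A minimal sketch of this evaluation setup, assuming an 80/20 hold-out split and synthetic stand-in data; scikit-learn's warm_start option is used here only as a convenient way to score the model after each training pass.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; in practice this is the preprocessed dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# 80% is used exclusively for training, 20% is held back to approximate
# the generalization error on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# warm_start lets us train one pass at a time and score after each one.
model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1, warm_start=True)
for epoch in range(10):
    model.fit(X_train, y_train)   # one more pass over the training data
    score = accuracy_score(y_test, model.predict(X_test))
    print(f"epoch {epoch + 1}: held-out accuracy {score:.3f}")
```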

Example

Assume we want to develop an app to automatically diagnose melanoma skin cancer. A convenient patient-side solution would be to take a photo of the affected skin and upload it to the app. The backend machine learning model should then predict the probability of melanoma skin cancer given the image. To develop such a system, we can follow the aforementioned steps. The dataset we use to train the model is the SIIM-ISIC melanoma skin cancer set, which contains 44,000 labeled images. Any preprocessing and transformation steps have already been done. We can use a ResNeXt-101 model architecture, introduced by FAIR (Facebook) in 2016. The architecture contains roughly 44 million parameters that will be jointly learned on the 35,000 training images. During training, the model tries to minimize a loss function, for example a cross-entropy loss, and updates its parameters accordingly. Performance scores such as precision/recall, accuracy, MSE, and F1-score approximate the model's accuracy and generalization capabilities on the 9,000 unseen images and labels. Once the model passes this verification phase, it can be used for online experimentation and inference on the uploaded images. We will not go into detail on the further steps needed to ship the project, since we are only focusing on the technical implementation here.
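As a rough sketch of this setup (not the exact configuration used for the app): torchvision ships ResNeXt-101 variants, so we can load a pretrained backbone and replace its head with a two-class output for benign vs. melanoma. The images below are random noise standing in for preprocessed skin photos, and the 32x8d variant differs in parameter count from the ~44 million figure above.

```python
import torch
from torch import nn
from torchvision import models

# Pretrained ResNeXt-101 backbone (torchvision's 32x8d variant, used here
# only as a stand-in for the architecture mentioned above).
model = models.resnext101_32x8d(weights="IMAGENET1K_V1")

# Replace the final classification layer: two outputs, benign vs. melanoma.
model.fc = nn.Linear(model.fc.in_features, 2)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One training step on a hypothetical batch of preprocessed skin images.
images = torch.randn(8, 3, 224, 224)    # 8 RGB images, 224x224 pixels
labels = torch.randint(0, 2, (8,))      # 0 = benign, 1 = melanoma
optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()

# At inference time the softmax output approximates P(melanoma | image).
probabilities = torch.softmax(model(images), dim=1)[:, 1]
```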

Now that we have gained a basic understanding of the technical building blocks involved, we will use the aforementioned modeling techniques to design a production-ready machine learning system in the next post. That post will cover further components such as project setup, the data pipeline, and detailed elaborations on serving, including testing, deploying, and maintaining the system.


Passionate about creating elegant machine learning solutions. I write about NLP and Neural Reasoning. Opinions are my own; often flawed.