From Data Science to Carbon Footprint Compliance: Discover the World of TinyML

Using TensorFlow Lite Micro

Rafael Tappe Maestro

Figure by itemis AG, https://info.itemis.com/iot-systems/download-tinyml-rock-paper-scissors-dataset/ (with permission)

Co-authored by Nikolas Rieder and Rafael Tappe Maestro.

Abstract

You want to expand your expertise in artificial intelligence and explore a more energy efficient approach to machine learning (ML). This article introduces you to ML on microcontrollers (MCUs), also known as tiny machine learning (TinyML). Get ready to lose against an MCU at rock, paper, scissors. We will cover data collection and preprocessing, model design and how to get your model running on the MCU. Our example provides you with all you need for your own TinyML project, from start to finish.

What’s an MCU?

MCU stands for microcontroller unit. An MCU has a comparatively weak processor connected to a variety of input and output pins. Unlike your computer, an MCU doesn’t run a full blown operating system and handles only a few processes at a time. An MCU also features only small amounts of storage and RAM.

For our project, the ESP-EYE board from Espressif is ideal. This board bundles the ESP32 microcontroller with an already connected camera, saving us some work.

Why should I care about TinyML?

Tech companies such as DeepMind and OpenAI dominate the ML domain with experts and GPU power. Models in natural language processing (NLP) in particular, such as GPT-3, need days for a full training cycle on dozens of high performance GPUs running in parallel. This goes hand in hand with high energy consumption, and as NLP models grow further each year, so does their need for energy. The cost of development is several times that of a single training cycle, with additional compute time needed for hyperparameter optimization. TinyML turns the tables somewhat by going small. Because of memory limitations, large AI models don’t fit onto microcontrollers in the first place. The figure below shows the disparity in hardware requirements.

Figure by itemis AG, https://app.hubspot.com/documents/761475/view/277264732?accessId=dc8033

What advantages does ML on MCUs offer over using AI services in the cloud? We find seven main reasons you may want to use TinyML instead.

Cost
Microcontrollers are cheap to buy and run.

Environmentally friendly
Running AI on a microcontroller consumes little energy.

Integration
Microcontrollers are easily integrated into existing environments, for example production lines.

Privacy and security
Data can be processed locally, on-device. Data doesn’t have to be sent through the internet.

Rapid prototyping
TinyML enables you to develop proof-of-concept solutions in a short period of time.

Autonomous and reliable
Tiny devices can be used anywhere, even when there is no infrastructure.

Real-time
Data is processed directly on the microcontroller, with no network latency. The only limitation is the processing speed of the MCU.

You may wonder how much these points matter, particularly energy consumption. Is local processing really such a big deal? Even a single Google search consumes energy equivalent to powering a 60 W light bulb for 17 seconds. Voice assistants such as Siri, Google Assistant and Alexa already exploit TinyML: wake word recognition takes place locally across the board. More recently, the dictation function on iOS transcribes speech to text locally, too. That’s TinyML in action. Looking at the bigger picture, TinyML is inevitable as more smart devices come online. The amount of data produced, shared, consumed or stored keeps growing exponentially: in 2010 it amounted to 2 Zettabytes (ZB), a number expected to reach 181 ZB by 2025. TinyML fits right in among global efforts towards sustainability.

Rock Paper Scissors

Figure by the authors

Have you ever lost at rock, paper, scissors against an AI? Or do you want to impress your friends by defeating one? You will play against the ESP-EYE board using TinyML. This use case gives you a good overview of TinyML’s capabilities. The model uses convolutional layers, so you’ll learn about the intricacies of using those with TensorFlow Lite Micro. You’ll be able to adapt our project to recognize your cat, too! The project will also show the limitations of TinyML in terms of accuracy. There are five steps you need to take to make your project possible. The following sections provide a high level overview of those steps; if you want a closer look, see the documentation in our project repository, which explains the nifty details.

Gather data

Collecting good data is a crucial part of ML. To get things running, you need to take images of your hand forming rock, paper and scissors gestures. The more unique images, the better: the model will learn that your hand can appear at different angles and positions, and under varying lighting. A dataset consists of the recorded images and a label for each image.

It’s best to run your AI with the same sensors and in the same environment that were used to train it. This ensures that the model is familiar with the incoming data. Think, for example, of temperature sensors that output different voltages for the same temperature due to manufacturing differences. For our purpose, this means that recording images with the ESP-EYE camera on a uniform background is ideal; during deployment, the AI will work best against a similar background. You could also record images with a webcam, at the potential cost of some accuracy. Due to the limited capacity of the MCU, we record and process grayscale images of 96×96 pixels. After collecting data, we split it into a training and a test set.

Photo by the authors

Following the steps above, your collected images should look something like this. If you don’t want to collect data now, you can download our ready-made dataset here.
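If you collect your own images, splitting them into a training and a test set can be as simple as the sketch below. The folder layout, the file format and the 80/20 ratio are assumptions for illustration; the tooling in our repository may organize things differently.

```python
# a minimal sketch for splitting collected images into train and test sets;
# assumes a hypothetical layout like data/rock/*.jpg, data/paper/*.jpg, ...
from pathlib import Path
from sklearn.model_selection import train_test_split

GESTURES = ["rock", "paper", "scissors"]

paths, labels = [], []
for label, gesture in enumerate(GESTURES):
    for path in Path("data", gesture).glob("*.jpg"):
        paths.append(path)
        labels.append(label)

# hold out 20% of the images as the test set, stratified per gesture
train_paths, test_paths, train_labels, test_labels = train_test_split(
    paths, labels, test_size=0.2, stratify=labels, random_state=42)
```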

Preprocess data

For our dataset, we recorded images using both the ESP-EYE and a webcam. Since the ESP-EYE captures grayscale images at 96×96 resolution directly, those need no further processing. The webcam images, however, had to be cropped and downsized to 96×96 pixels and converted from RGB to grayscale. Lastly, we normalize all images. Below, you see the intermediate steps of our processing.

Photo by the authors
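In code, the webcam preprocessing boils down to a few lines. This is a minimal sketch using Pillow and NumPy; the function name and the center-crop strategy are our choices for illustration, not necessarily what the repository does.

```python
# a minimal preprocessing sketch: grayscale, center-crop, resize, normalize
import numpy as np
from PIL import Image

TARGET = 96  # the MCU works on 96x96 pixel images

def preprocess(path):
    img = Image.open(path).convert("L")            # RGB -> grayscale
    width, height = img.size
    side = min(width, height)                      # center-crop to a square
    left, top = (width - side) // 2, (height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((TARGET, TARGET))             # downsize to 96x96
    return np.asarray(img, dtype=np.float32) / 255.0  # normalize to [0, 1]
```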

Design a model

Since we’re dealing with image processing, convolutional layers are essential for our model. Because memory on the MCU is quite limited, we cannot use deep architectures. Even models designed for edge devices, such as MobileNet and its successors, are too large. Therefore, our model relies on three convolutional blocks, each consisting of convolution, batch normalization and max pooling. The convolutional layers are regularized with L1 and L2 regularization to reduce overfitting. As is common, we use a ReLU activation.

After the third convolutional block follows a dense layer of 9 neurons, and finally a dense softmax layer that outputs one probability per gesture.
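To make the architecture concrete, here is a minimal Keras sketch. The filter counts, the regularization strengths and the ReLU on the 9-neuron dense layer are placeholder assumptions; the tuned values come out of the hyperparameter search described next.

```python
# a sketch of the model: three blocks of convolution + batch normalization
# + max pooling, then a 9-neuron dense layer and a softmax output
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_rps_model(filters=(8, 16, 32), kernel_size=3, l1=1e-4, l2=1e-4):
    reg = regularizers.l1_l2(l1=l1, l2=l2)
    inputs = tf.keras.Input(shape=(96, 96, 1))          # 96x96 grayscale input
    x = inputs
    for f in filters:                                   # three convolutional blocks
        x = layers.Conv2D(f, kernel_size, padding="same",
                          activation="relu", kernel_regularizer=reg)(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(9, activation="relu")(x)           # dense layer of 9 neurons
    outputs = layers.Dense(3, activation="softmax")(x)  # one probability per gesture
    return tf.keras.Model(inputs, outputs)
```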

Hyperparameters

We tuned the hyperparameters using random search. Compared to other hyperparameter optimization methods, random search is easy to implement, understand and parallelize, and the search can be stopped and resumed at any time. Using random search, we trained 100 models for 20 epochs each, taking about 10 hours of wall-clock time. An interesting result was that SGD optimization was superior to Adam. We know, blasphemy. Besides the learning rate, the search also covered the regularization parameters, the number of convolutional filters, and the kernel size and stride.
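For illustration, here is what such a search could look like with KerasTuner, reusing build_rps_model from the sketch above. The exact search space and tooling in our repository may differ, and x_train, y_train, x_val and y_val are assumed to come from the earlier data steps.

```python
# a hedged random-search sketch using KerasTuner
import keras_tuner as kt
import tensorflow as tf

def model_for_trial(hp):
    model = build_rps_model(                     # from the sketch above
        kernel_size=hp.Choice("kernel_size", [3, 5]),
        l1=hp.Float("l1", 1e-6, 1e-2, sampling="log"),
        l2=hp.Float("l2", 1e-6, 1e-2, sampling="log"))
    lr = hp.Float("learning_rate", 1e-4, 1e-1, sampling="log")
    model.compile(tf.keras.optimizers.SGD(learning_rate=lr),  # SGD beat Adam
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

tuner = kt.RandomSearch(model_for_trial, objective="val_accuracy",
                        max_trials=100, directory="tuning")
tuner.search(x_train, y_train, epochs=20, validation_data=(x_val, y_val))
```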

Convert a model

After training our model, we obtain an AI model in the TensorFlow format. Because the ESP-EYE cannot interpret this format, we convert the model into something the microcontroller can read. We start with the conversion to a TfLite model. TfLite is a more compact TensorFlow format, and quantization during conversion shrinks the model further. TfLite is commonly used on edge devices such as smartphones and tablets all around the world. The last step is to convert the TfLite model into a C array, as the microcontroller cannot interpret TfLite directly.
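In code, the TfLite conversion is only a few lines. The sketch below assumes full-integer quantization, a common choice for TensorFlow Lite Micro, with a placeholder representative_dataset generator that yields calibration samples; our repository may configure the converter differently.

```python
# a conversion sketch: Keras model -> quantized TfLite flatbuffer
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset  # calibration samples
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```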

Embedded environment

In this section, we will discuss our code running on the MCU. You may know that microcontrollers are typically programmed in C/C++. Well, there is MicroPython, but that’s another story. Before you move on, you should have at least a basic knowledge of C/C++. There are tons of useful tutorials and beginner guides out there to learn about what some call the mother of all programming languages.

Let’s see what happens on the MCU. In the figure below, you can see the structure of our code. We start by reading raw data from the camera sensor, which we can later feed to the model. Then we apply the same preprocessing to the incoming raw data that was used on our training data. After that, we pipe the preprocessed data into the main function, where the TensorFlow Lite Micro library makes the prediction. Since the model ends in a softmax layer, the class with the highest probability is the prediction for a given image. To increase accuracy, we take 5 images in quick succession and make an ensemble prediction, as sketched below. The last step is for the model to make its own move.

Figure by the authors
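The on-device implementation is C++, but the ensemble step is easy to sketch in Python: average the softmax outputs of the 5 frames and pick the most probable gesture. Averaging is our assumption here; a majority vote over the per-frame predictions works just as well.

```python
# illustrates the ensemble logic only; the real implementation lives in the
# C++ loop on the MCU
import numpy as np

GESTURES = ["rock", "paper", "scissors"]

def ensemble_predict(predict, frames):
    # predict maps one preprocessed 96x96 image to a vector of 3 probabilities
    probabilities = np.mean([predict(frame) for frame in frames], axis=0)
    return GESTURES[int(np.argmax(probabilities))]
```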

To fully understand what’s happening on the C/C++ side, we recommend taking a look at the code, so let us point you in the right direction. As C++ programs start in a main function, you might expect everything to come together in main.cpp. However, you should be looking at main_functions.cpp and its loop function. This function is executed in an infinite loop, also known as a super loop, continuously repeating the steps in the diagram above.

Deploy model

Now we can deploy our model onto the microcontroller. Before we build (compile) and flash the C++ program, we need to place the new C array which encodes our model into the intended file, micro_model.cpp. Replace the content of the C array, and don’t forget to update the array length variable, micro_model_len. We’ve provided the script model_to_mcu.py to make this easier. And that’s it!
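If you’re curious what such a script boils down to, here is a simplified sketch of turning a TfLite flatbuffer into a C array. It is not the actual model_to_mcu.py, which additionally rewrites micro_model.cpp for you.

```python
# a simplified sketch: TfLite flatbuffer -> C array source text
def tflite_to_c_array(tflite_path, var_name="micro_model"):
    with open(tflite_path, "rb") as f:
        data = f.read()
    rows = (", ".join(f"0x{byte:02x}" for byte in data[i:i + 12])
            for i in range(0, len(data), 12))
    body = ",\n  ".join(rows)
    return (f"alignas(8) const unsigned char {var_name}[] = {{\n  {body}\n}};\n"
            f"const int {var_name}_len = {len(data)};\n")

print(tflite_to_c_array("model.tflite"))
```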

Conclusion

With this article, we hope to have brought you a new perspective on machine learning. Big Data, cloud deployments and expensive training aren’t the only way forward for data science. TinyML reduces the carbon footprint of AI models and builds a bridge into the world of embedded systems and the Internet of Things. Computing at the edge inspires AI deployments that feel more natural than their cloud or desktop PC counterparts. TinyML isn’t without its challenges; consider, for example, the prevalence of C and C++ and the feature incompleteness of TensorFlow Lite Micro compared to classic TensorFlow.

Expand the example

How about a challenge? A new purpose in life? Want to impress old friends or find new ones? Take rock, paper, scissors to the next level by adding lizard and Spock. Your AI friend will be one skill closer to world domination. First, have a look at our rock, paper, scissors repository and make sure you can replicate the steps above; the README files will help you with the details. The figure below explains the rules of the extended game. You need to add two extra gestures and a few new winning and losing conditions.

Photo by itemis AG, https://info.itemis.com/iot-systems/download-tinyml-rock-paper-scissors-dataset/

Start your own project

If you liked this article and want to start your own project, we provide a project template which uses the same simple pipeline as our rock, paper, scissors project. You can find the template here. Don’t hesitate to show us your projects via social media. We are curious to see what you are able to create!

You can find out more about TinyML here and there. Pete Warden’s book is a great resource.

