Portable Computer Vision: TensorFlow 2.0 on a Raspberry Pi

Tiny, low-cost object detection and classification.

Towards Data Science
8 min read · Jun 24, 2019

Part 1 — Introduction

For roughly $100 USD, you can add deep learning to an embedded system or your next internet-of-things project.

Are you just getting started with machine/deep learning, TensorFlow, or Raspberry Pi? Perfect, this blog series is for you!

In this series, I will show you how to:

  1. Deploy a pre-trained image classification model (MobileNetV2) using TensorFlow 2.0 and Keras.
  2. Convert a model to TensorFlow Lite, a model format optimized for embedded and mobile devices.
  3. Accelerate inferences of any TensorFlow Lite model with Coral’s USB Edge TPU Accelerator and Edge TPU Compiler.
  4. Employ transfer learning to re-train MobileNetV2 with a custom image classifier.

The first post in this series (you’re reading it now!) will walk you through build materials, installation, and deploying MobileNetV2 to your Raspberry Pi.
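
If you’re curious what steps 1 and 2 look like in code, here’s a minimal sketch (run on your primary computer, not the Pi). It loads MobileNetV2 in Keras and converts it to TensorFlow Lite; the output filename is just a placeholder.

# Minimal sketch: load a pre-trained MobileNetV2 classifier, then convert it
# to a TensorFlow Lite flatbuffer. Assumes TensorFlow 2.0 is installed.
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open("mobilenet_v2.tflite", "wb") as f:  # placeholder output path
    f.write(tflite_model)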

Terms & References 📚

Raspberry Pi — a small, affordable computer popular with educators, hardware hobbyists, and roboticists. 🤖

TensorFlow — an open-source platform for machine learning.

TensorFlow Lite — a lightweight library for deploying TensorFlow models on mobile and embedded devices.

Convolutional Neural Network — a type of deep-learning model well-suited for image classification and object detection applications.

MobileNetV2 — a state-of-the-art image recognition model optimized for performance on modest mobile phone processors.

Comparison of general-purpose computer vision neural networks. Image Credit: MobileNetV2: The Next Generation of On-Device Computer Vision Networks

Edge TPU — a tensor processing unit (TPU) is an integrated circuit for accelerating computations performed by TensorFlow. The Edge TPU was developed with a small footprint for mobile and embedded devices “at the edge.”

TPUv1, TPUv2 (left, middle) at Cloud Next ‘18. Edge TPUs on a United States penny (right). Image credit: Google

Part 2 — Build List ✅

Starter Kit

If you’re just getting started with Raspberry Pi, I recommend the Pi Camera Pack ($90) by Arrow. It includes everything you need to begin immediately:

  • 5V 2.4A MicroUSB Power Supply
  • 320x240 2.8" PiTFT resistive touch-screen
  • Raspberry Pi 3 Model B
  • Raspberry Pi Camera v2
  • Plastic Case
  • 8GB MicroSD Card with NOOBS installation manager pre-loaded

Coral USB Edge TPU Accelerator (Optional)

You can compile TensorFlow Lite models to run on Coral’s USB Accelerator (Link), for quicker model predictions.

Real-time applications benefit significantly from this speed-up. An example is the decision-making module of a self-driving robot.

Some applications can tolerate slower predictions and might not require TPU acceleration. For example, you would not need TPU acceleration to build a smart doggie door that unlocks for your pooch (but keeps raccoons out).

If you’re just getting started, skip buying this component.

Not sure whether you need the USB Accelerator? The MobileNet benchmarks below might help you decide. They show inference time (in ms) — lower is better!

Image Credit: Alasdair Allan, Benchmarking TensorFlow and TensorFlow Lite on the Raspberry Pi
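
If you do end up using the accelerator, the Python side later boils down to loading a compiled model with the Edge TPU delegate. Here’s a hedged sketch, assuming the Coral runtime (libedgetpu) and the tflite_runtime package are installed; the model path is a placeholder.

# Hedged sketch: run an Edge TPU-compiled TFLite model through the Edge TPU
# delegate. Assumes libedgetpu and tflite_runtime are installed on the Pi.
import numpy as np
from tflite_runtime.interpreter import Interpreter, load_delegate

interpreter = Interpreter(
    model_path="mobilenet_v2_edgetpu.tflite",  # placeholder path
    experimental_delegates=[load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()

input_detail = interpreter.get_input_details()[0]
output_detail = interpreter.get_output_details()[0]

# Time a single inference with a dummy frame.
frame = np.zeros(input_detail["shape"], dtype=input_detail["dtype"])
interpreter.set_tensor(input_detail["index"], frame)
interpreter.invoke()
scores = interpreter.get_tensor(output_detail["index"])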

Custom Build

If you already have a Raspberry Pi or some components lying around, the starter kit might include items you don’t need.

Here are the parts I used for my own builds (approximately $250 / unit).

  • Raspberry Pi Model 3 B+ ($35)
  • Raspberry Pi Camera v2 ($30)
  • Coral USB Edge TPU Accelerator — accelerates model inferencing ($75, link)
  • Pi Foundation Display — 7" Touchscreen Display ($80, link)
  • SmartiPi Touch Stand ($25, link)
  • Adjustable Pi Camera Mount ($5, link)
  • Flex cable for RPi Camera, 24" ($3, link)

I would love to hear about your own build list! ❤️ Tweet me @grepLeigh or comment below.

Part 3 — Raspberry Pi Setup 🍰

If you purchased an SD card pre-loaded with NOOBS, I recommend walking through this overview first: Setting up your Raspberry Pi

Before proceeding, you’ll want to:

  • Connect your Pi to the internet (doc)
  • SSH into your Raspberry Pi (doc)

Part 4 — Primary Computer: Download & Install Dependencies

rpi-vision is a set of tools that makes it easier for you to:

  • Install a lot of dependencies on your Raspberry Pi (TensorFlow Lite, TFT touch screen drivers, tools for copying PiCamera frame buffer to a TFT touch screen).
  • Deploy models to a Raspberry Pi.
  • Train new models on your computer or Google Cloud’s AI Platform.
  • Compile 8-bit quantized models for an Edge TPU (sketched briefly below).
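
That last capability deserves a quick note: the Edge TPU Compiler (the edgetpu_compiler command-line tool) only accepts fully 8-bit quantized TensorFlow Lite models. A hedged sketch of post-training quantization is below; representative_dataset() is a hypothetical placeholder you would replace with real sample images.

# Hedged sketch: post-training integer quantization for the Edge TPU Compiler.
# representative_dataset() is a placeholder; feed real preprocessed images.
import numpy as np
import tensorflow as tf

def representative_dataset():
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

model = tf.keras.applications.MobileNetV2(weights="imagenet")
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

with open("mobilenet_v2_quant.tflite", "wb") as f:
    f.write(converter.convert())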

1. Clone the rpi-vision repo on your primary computer (not your Raspberry Pi).

$ git clone git@github.com:leigh-johnson/rpi-vision.git && cd rpi-vision

2. On your primary computer, create a new virtual environment, then install the rpi-vision package.

$ pip install virtualenv
$ virtualenv -p $(which python3) .venv
$ source .venv/bin/activate
$ pip install -e .

3. Verify you can SSH into your Raspberry Pi before proceeding.

If you’re using the default Raspbian image, your Pi’s hostname will be raspberrypi.local

$ ssh pi@raspberrypi.local

Part 5 — Primary Computer: Create Configuration Files

rpi-vision uses Ansible to manage deployments and tasks on your Raspberry Pi. Ansible is a framework for automating the configuration of computers.

Create 2 configuration files required by Ansible:

.env/my-inventory.ini

If you’re using a custom hostname for your Pi, replace raspberrypi.local.

tee -a .env/my-inventory.ini <<EOF
[rpi_vision]
raspberrypi.local
[rpi_vision:vars]
ansible_connection=ssh
ansible_user=pi
ansible_python=/usr/bin/python3
EOF

.env/my-vars.json

If you’re using a custom hostname for your Pi, replace raspberrypi.local.

tee -a .env/my-vars.json <<EOF
{
"RPI_HOSTNAME": "raspberrypi.local",
"VERSION": "release-v1.0.0"
}
EOF

Part 6 — Raspberry Pi: Install Dependencies

From the rpi-vision directory on your primary computer, run:

$ make rpi-install

You’ll see the output of an Ansible playbook as it runs.

A quick summary of what’s being installed on your Pi:

  • rpi-vision repo
  • rpi-fbcp (a tool that mirrors the primary framebuffer, including the PiCamera preview, onto the TFT touch screen display)
  • TFT touch screen drivers and X11 configuration

You can inspect the tasks run on your Raspberry Pi by opening playbooks/bootstrap-rpi.yml

While the installation is running, read through the next section to learn how CNNs work and why they are useful for computer vision tasks.

Part 7 — Introduction to CNNs (convolutional neural networks)

CNNs are the key technology powering self-driving cars and image search engines. The technology is common for computer vision, but can also be applied to any problem with a hierarchical pattern in the data, where a complex pattern can be assembled from simpler patterns.

Modeling the Visual Cortex

In the late 1950s and 1960s, David H. Hubel and Torsten Wiesel performed experiments on cats and monkeys to better understand the visual cortex.

Diagram of the implant installed in the skulls of cats with a trephine. Image Credit: SINGLE UNIT ACTIVITY IN STRIATE CORTEX OF UNRESTRAINED CATS

They demonstrated that neurons in the striate cortex respond to stimuli within a limited region of the visual field, which they called a receptive field.

They noted concentric overlapping responses, where complex patterns were combinations of lower-level patterns.

Their findings also revealed specialization, where some neurons would only respond to a specific shape or pattern.

In the 1980s, inspired by Hubel and Wiesel, Kunihiko Fukushima published the neocognitron, a neural network capable of learning patterns with geometrical similarity.

Diagram of a neocognitron, the foundation for modern CNNs. Image Credit: Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position

The neocognitron has two key properties:

  • Learned patterns are hierarchical. Increasingly complex patterns are composed from simpler patterns.
  • Learned patterns are position-invariant (translation-invariant). After the network learns a pattern, it can recognize that pattern at different locations. After learning how to classify a dog in the center of a frame, the network can accurately classify a dog shifted to the edge of the frame without learning an entirely new pattern.

The neocognitron model is the inspiration for modern convolutional neural networks.

Visualizing a Convolution Operation: 2D

(Left) 2D 4x4 Input matrix. (Middle) 2D 2x2 kernel. (Right) 2D 2x2 output feature map.

The input layer is fed into convolutional layers, which transform regions of the input using a filter.

The filter is also referred to as a kernel.

The filter “slides” to each possible position in the input, and each result is added to the feature map.

At each position, the overlapping elements of the input and the filter are multiplied element-wise.

The products are summed, and the sum is stored in the corresponding cell of the feature map.

The operation is repeated for every position in the input matrix.
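
To make the arithmetic concrete, here is a minimal NumPy sketch of the operation described above. With a stride of 2, a 4x4 input and a 2x2 kernel produce the 2x2 feature map shown in the illustration.

# Minimal sketch of a 2D convolution with no padding, matching the
# 4x4 input / 2x2 kernel / 2x2 feature map illustration above.
import numpy as np

def conv2d(inputs, kernel, stride=1):
    kh, kw = kernel.shape
    out_h = (inputs.shape[0] - kh) // stride + 1
    out_w = (inputs.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            # Multiply the overlapping region by the kernel element-wise, then sum.
            feature_map[i, j] = np.sum(inputs[r:r + kh, c:c + kw] * kernel)
    return feature_map

x = np.arange(16).reshape(4, 4)  # 4x4 input matrix
k = np.array([[1, 0], [0, -1]])  # 2x2 kernel
print(conv2d(x, k, stride=2))    # 2x2 feature map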

Visualizing a Convolution Operation: 3D

Image Credit: Applied Deep Learning — Part 4: Convolutional Neural Networks

The input layer of a CNN is usually a 3D data structure with height, width, and channels (RGB or greyscale values).

The deeper we go in the feature map stack, the sparser each map becomes. That means each filter detects fewer, more specific features.

The first few layers of the feature map stack detect simple edges and shapes, and look similar to the input image. As we go deeper into a feature map stack, features become more abstract to the human eye. Deeper feature layers encode classification data, like “cat face” or “cat ear”.

Comparison of feature maps from the first convolution layer (block1_conv1) with later layers (block5_conv1). Image Credit: Applied Deep Learning — Part 4: Convolutional Neural Networks
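
If you want to inspect feature maps yourself, here is a hedged sketch using Keras. The layer names follow the VGG16 convention used in the figure above, and cat.jpg is a placeholder for any image on disk.

# Hedged sketch: compare an early feature map with a deep one, using the
# VGG16 layer names (block1_conv1, block5_conv1) from the figure above.
import numpy as np
import tensorflow as tf

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False)
extractor = tf.keras.Model(
    inputs=base.input,
    outputs=[base.get_layer(name).output
             for name in ("block1_conv1", "block5_conv1")],
)

img = tf.keras.preprocessing.image.load_img("cat.jpg", target_size=(224, 224))
x = tf.keras.applications.vgg16.preprocess_input(
    tf.keras.preprocessing.image.img_to_array(img)[np.newaxis, ...])

early, late = extractor.predict(x)
print(early.shape)  # (1, 224, 224, 64): large maps, simple edge-like patterns
print(late.shape)   # (1, 14, 14, 512): small, abstract, many channels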

Do you want to learn more about CNNs?

Your dependency installation is probably done by now. To forge ahead, skip to Part 8 - Deploy Pre-trained Model (MobileNetV2).

If you plan on training a custom classifier or want to read more about convolutional neural networks, start here:

Part 8 — Deploy Pre-trained Model (MobileNetV2)

Live Demo (using TensorFlow 2.0)

I used this code to sanity-check the TensorFlow 2.0-beta0 wheel that I cross-compiled for my Raspberry Pi 3.

1. SSH into your Raspberry Pi.

$ ssh pi@raspberrypi.local

2. Start a new tmux session

pi@raspberrypi:~ $ tmux new-session -s mobilenetv2

3. Split the tmux session vertically by pressing control+b, then “

4. Start an fbcp process, which will copy the framebuffer from the PiCamera to the TFT display via the SPI interface. Leave this process running.

pi@raspberrypi:~ $ fbcp

5. Switch tmux panes by pressing control+b, then o.

6. Activate the virtual environment installed earlier, in Part 6.

pi@raspberrypi:~ $ cd ~/rpi-vision && . .venv/bin/activate

7. Start a mobilenetv2 agent process. The agent will take roughly 60 seconds to initialize.

pi@raspberrypi:~/rpi-vision $ python rpi_vision/agent/mobilenet_v2.py

You’ll see a summary of the model’s base, and then the agent will print inferences until stopped. Click for a gist of what you should see.

This demo uses weights for ImageNet classifiers, which you can look up at image-net.org.
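
Under the hood, the agent’s prediction step boils down to something like the sketch below (a simplification, not the actual rpi_vision source): preprocess a 224x224 camera frame, run it through MobileNetV2, and decode the top ImageNet labels.

# Simplified sketch of the classification step (not the actual rpi_vision agent).
# Expects a 224x224 RGB frame from the PiCamera as a NumPy array.
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, decode_predictions, preprocess_input)

model = MobileNetV2(weights="imagenet")

def classify(frame: np.ndarray):
    x = preprocess_input(frame.astype(np.float32)[np.newaxis, ...])
    preds = model.predict(x)
    # [(class_id, label, score), ...] for the top 3 ImageNet classes
    return decode_predictions(preds, top=3)[0]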

Thank you for reading!

Congratulations, you just deployed an image classification model to your Raspberry Pi! ✨

Looking for more hands-on examples of machine learning for Raspberry Pi and other small devices? Sign up for my newsletter!

I publish examples of real-world ML applications (with full source code) and nifty tricks like automating away the pains of bounding-box annotation.

Written by Leigh Johnson

Building automation/monitoring for 3D printers: https://www.printnanny.ai/ | Applied ML and Edge ML: https://bitsy.ai/ | Google Developer Expert.
