Robot-tank with Raspberry Pi and Intel Neural Compute Stick 2

Constantin Toporov
Towards Data Science
6 min read · Oct 8, 2019


In my previous article, I did road image segmentation with OpenCV-DNN and Enet.

That experiment failed because of performance: the segmentation process turned out to be too heavy for the Raspberry.

There were two ideas to work around the problem:

  • train Enet with smaller pictures in the hope it would be faster
  • run the segmentation on dedicated neural network hardware

The second idea seemed more interesting, and a few days later I got an Intel Neural Compute Stick 2.

It is pretty big, and it was not easy to fit the module into the robot layout.

Due to its size, it does not fit into the Raspberry's lower USB ports. Since the left ports are behind the camera stand and therefore unavailable, the only option was to insert the NCS into the upper right port. The GPS module was already there, so it was extended with a cable and moved to the lower right port.

Intel NCS

Intel recently released the second version of their Neural Compute Stick with a new API, which turned out to be incompatible with the previous version.

The new API is called OpenVINO and includes OpenCV and some tools for working with neural networks.

There are some introductory articles about NCS2 and the OpenVINO framework.

It is easy to start working with NCS. Intel supports Raspbian out of the box, so there were no issues with the installation.

Then it turned out that NCS supports only its own format of neural networks. OpenVINO contains a tool, the Model Optimizer, to convert other formats. Supported sources include Caffe, TensorFlow, PyTorch, and others.

Intel also provides a set of pre-trained models for different applications in a dedicated model zoo.

There are two models for semantic segmentation in the zoo.

Unfortunately, the more advanced one is unable to run on the NCS.

Neural networks on NCS

There are a few steps to run neural network inference on the NCS.

Initialize device

The device name "MYRIAD", the word "plugin", and the hardcoded path to the library all look like artifacts from an earlier era.

from openvino.inference_engine import IENetwork, IEPlugin

ncs_plugin = IEPlugin(device="MYRIAD", plugin_dirs="/opt/intel/openvino/inference_engine/lib/armv7l")

Load model

Then we need to load a neural network model to the device.

It is a heavy operation. The small model I used took about 15 seconds to be loaded.

The good news is it has to be done only once.

model = IENetwork(model=xml_path, weights=bin_path)         
net = ncs_plugin.load(network=model)

Run inference

Then we can run an inference and get an output tensor.

import numpy as np

# Get the names of the input and output layers
input_blob = next(iter(model.inputs))
out_blob = next(iter(model.outputs))

# The network expects a batch of images in NCHW layout
n, c, h, w = model.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))
images[0] = image

# Run inference and extract the output tensor
res = net.infer(inputs={input_blob: images})
res = res[out_blob]
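
The image variable is expected to already be in the CHW layout the network wants. A minimal preprocessing sketch for a BGR frame coming from OpenCV (frame is a made-up variable name):

import cv2

# Resize to the network input size and switch from OpenCV's HWC layout to CHW
resized = cv2.resize(frame, (w, h))
image = resized.transpose((2, 0, 1))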

Single process problem

It suddenly turned out to be impossible to use the NCS from two different processes.

Loading the model in the second process generates an error:

E: [ncAPI] [    684447] resetAll:348     Failed to connect to stalled device, rc: X_LINK_ERROR 
E: [ncAPI] [ 691700] ncDeviceOpen:672 Failed to find suitable device, rc: X_LINK_DEVICE_NOT_FOUND
Traceback (most recent call last):
net = ncs_plugin.load(network=model)
File "ie_api.pyx", line 395, in openvino.inference_engine.ie_api.IEPlugin.load
File "ie_api.pyx", line 406, in openvino.inference_engine.ie_api.IEPlugin.load
RuntimeError: Can not init USB device: NC_ERROR

A search turned up a similar problem on the Intel support forum. The topic references the documentation, which clearly states:

Single device cannot be shared across multiple processes.

Image segmentation with OpenVINO

OpenVINO already includes a ready-to-use model for semantic segmentation, along with samples.

The model does not work as well as Enet did, but the results are still acceptable.

In any case, this implementation of Enet is in the old Torch format, which is not supported by the OpenVINO Model Optimizer.

By the way, the OpenVINO model itself is not open: people have asked for it, but the recommendation is to take a similar PyTorch model and tune it.

The performance competition of NCS + OpenVINO segmentation versus Raspberry + Enet was won by NCS by a wide margin: 0.8 seconds versus 6.
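
For reference, turning the raw output into a viewable mask takes only a few lines. A minimal sketch, assuming the model returns a per-pixel class-index map (some models instead return per-class scores and need an argmax first) and using a made-up two-color palette:

import numpy as np

# res: raw segmentation output, assumed shape (1, 1, H, W) with a class index per pixel
classes = res[0][0].astype(np.int32)

# Hypothetical palette: class 0 = background (black), class 1 = road (green), in BGR
palette = np.array([[0, 0, 0], [0, 255, 0]], dtype=np.uint8)
mask = palette[classes % len(palette)]  # (H, W, 3) color mask, ready to blend with the frame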

Directions decision-making

The tank (aka PiTanq) uses an image classifier to decide which way to go: left, right or straight. The details are described in the dedicated article.

The classification network was trained with Keras and runs on the Raspberry with TensorFlow (TF has an adapter for the Keras format).

The model is very simple and shows good performance even on Raspberry: 0.35 seconds per image.

But, having the NCS, we would expect some performance gain, so we need to run the Keras model on the NCS. OpenVINO supports many different NN formats, including TensorFlow, but not Keras.

Converting Keras to TF is a common task, and there are many sources available. I used this guide.

The same author has another article describing all the steps to run Keras on the NCS.

Intel also provides documentation for this case.

Finally, I put together the following code based on all these sources:

import tensorflow as tf
from tensorflow.python.framework.graph_util import convert_variables_to_constants
from keras import backend as K
from keras.models import load_model
from keras.models import model_from_json
def load_keras_model(json_file, model_file):
    # Restore the model architecture from JSON and the weights from HDF5
    jf = open(json_file, 'r')
    loaded_model_json = jf.read()
    jf.close()
    loaded_model = model_from_json(loaded_model_json)
    loaded_model.load_weights(model_file)
    return loaded_model

def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    # Convert all variables in the current TF session to constants
    # so the graph can be saved as a single frozen .pb file
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = convert_variables_to_constants(session, input_graph_def, output_names, freeze_var_names)
        return frozen_graph

model = load_keras_model('./model.json', './model.h5')
frozen_graph = freeze_session(K.get_session(),
                              output_names=[out.op.name for out in model.outputs])
tf.train.write_graph(frozen_graph, ".", "ktf_model.pb", as_text=False)

The same code is available on GitHub.

Now that we have a TF model, we convert it to OpenVINO format via the Model Optimizer:

python mo_tf.py --input_model "model/ktf_model.pb" --log_level=DEBUG -b1 --data_type FP16

Benchmarks showed a significant difference: 0.007 seconds per image (instead of 0.35).
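
Numbers like these can be reproduced with simple wall-clock timing around the inference call. A minimal sketch, assuming the Model Optimizer produced ktf_model.xml and ktf_model.bin in the model directory (the paths are assumptions, adjust to your layout):

import time
import numpy as np
from openvino.inference_engine import IENetwork, IEPlugin

# Load the converted model on the NCS
ncs_plugin = IEPlugin(device="MYRIAD", plugin_dirs="/opt/intel/openvino/inference_engine/lib/armv7l")
model = IENetwork(model="model/ktf_model.xml", weights="model/ktf_model.bin")
net = ncs_plugin.load(network=model)

# Feed a dummy tensor of the right shape just to measure inference time
input_blob = next(iter(model.inputs))
n, c, h, w = model.inputs[input_blob].shape
dummy = np.random.rand(n, c, h, w).astype(np.float32)

start = time.time()
net.infer(inputs={input_blob: dummy})
print("Inference took %.3f s" % (time.time() - start))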

All the models (Keras, TF, OpenVINO) are on GitHub.

Object detection

Object detection is another feature of the robot. It was implemented with OpenCV-DNN and a MobileNet-SSD model.

The Intel model zoo contains a lot of narrowly specialized detectors based on MobileNet-SSD, but the general-purpose model itself is missing.

However, it is mentioned in the list of compatible TensorFlow models.

Using the previous network-conversion experience, we generate an OpenVINO model from MobileNet-SSD 2018_01_28.

Interestingly, if you try to open this version of MobileNet-SSD with OpenCV-DNN, it fails:

cv2.error: OpenCV(4.1.0-openvino) /home/jenkins/workspace/OpenCV/OpenVINO/build/opencv/modules/dnn/src/tensorflow/tf_importer.cpp:530:  error: (-2:Unspecified error) Const input blob for weights not found in function 'getConstBlob'

But when we try to convert the OpenCV-compatible version, MobileNet-SSD 11_06_2017, that conversion also fails:

[E0919 main.py:317] Unexpected exception happened during extracting attributes for node FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_1_Conv2d_2_1x1_256/Relu6. Original exception message: operands could not be broadcast together with remapped shapes [original->remapped]: (0,) and  requested shape (1,0,10,256)

So for now we need two different versions of MobileNet-SSD if we want to use both the OpenVINO and OpenCV-DNN approaches.

OpenVINO wins the benchmark: 0.1 seconds per image versus 1.7.
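
For completeness, decoding the detector output looks roughly like this. A minimal sketch, assuming the usual SSD output layout of shape [1, 1, N, 7], where each row holds (image_id, class_id, confidence, x_min, y_min, x_max, y_max) with normalized coordinates; frame and the 0.5 threshold are made up for the example:

import cv2

# res: raw output of the detection network, frame: the original BGR image (assumptions)
h, w = frame.shape[:2]
for detection in res[0][0]:
    confidence = float(detection[2])
    if confidence < 0.5:  # arbitrary confidence threshold
        continue
    xmin, ymin = int(detection[3] * w), int(detection[4] * h)
    xmax, ymax = int(detection[5] * w), int(detection[6] * h)
    cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)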

The new version of MobileNet-SSD also looks less stable in its detections than the previous one.

Image classification

Another ability of PiTanq is to classify images with TensorFlow and an Inception model trained on ImageNet.

The tank used a pretty old version of Inception, from 2015-12-05. A lot of time has passed since then, and there are now four versions of Inception!

The good news is that all of them are supported by OpenVINO.

The old version's metrics on the test picture are:

  • laptop, laptop computer 62%
  • notebook, notebook computer 11%
  • 13 seconds
  • where is the cat?

Run the classifier on NCS:

  • laptop, laptop computer 85%
  • notebook, notebook computer 8%
  • 0.2 seconds
  • still no cat
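
Those percentages are just the largest entries of the classifier's output vector. A minimal sketch of extracting the top predictions, assuming res is the raw output tensor and labels is a list of ImageNet class names loaded elsewhere:

import numpy as np

# probs: the classifier's output vector, labels: ImageNet class names (both assumptions)
probs = np.squeeze(res)
top = probs.argsort()[-2:][::-1]  # indices of the two most probable classes
for i in top:
    print("%s %.0f%%" % (labels[i], probs[i] * 100))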

Conclusion

All the neural network scenarios that used TensorFlow and OpenCV-DNN were moved to the NCS.

That means TensorFlow can be retired from the board. To be honest, this framework is too heavy for the Raspberry.

The performance of the NCS allows using neural networks more extensively, for example, streaming video with real-time object detection or applying instance segmentation.

The single-process restriction is limiting, but it can be worked around by wrapping the NCS in a dedicated service.
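
One possible shape for such a service, sketched under the assumption that all clients live on the same Raspberry: a single process owns the NCS and answers inference requests over a local socket (the port number and the run_inference helper are made up for the example):

import pickle
import socket

# The only process that touches the NCS; other processes send images over a local socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 5000))  # arbitrary local port
server.listen(1)

while True:
    conn, _ = server.accept()
    data = conn.recv(4 * 1024 * 1024)  # naive: assumes the pickled image fits into one read
    image = pickle.loads(data)
    result = run_inference(image)  # hypothetical wrapper around net.infer()
    conn.sendall(pickle.dumps(result))
    conn.close()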
