
Part 1 – Initial Setup + Depth (OpenCV Spatial AI Competition Journey)

Journey on the development of a social distancing feedback system for the blind as part of the OpenCV Spatial Competition

Photo by Robert Norton on Unsplash

Update: This article is part of a series where I will be documenting my journey on the development of a social distancing feedback system for the blind as part of the OpenCV Spatial Competition. Check out the full series: Part 1, Part 2.

OpenCV Spatial AI Competition

Recently, the people at OpenCV launched the OpenCV Spatial AI competition sponsored by Intel as part of OpenCV’s 20th anniversary celebration. The main objective of the competition is to develop applications that benefit from the features of the new OpenCV AI Kit with Depth (OAK-D). The competition consists of two phases: the winners of the first phase were able to obtain an OAK-D for free to develop their application, and the second phase winners will receive a cash prize of up to $3,000.

The OAK-D contains a 12 MP RGB camera for deep neural inference and a stereo camera for depth estimation in real time using Intel’s Myriad X Vision Processing Unit (VPU).

If you want to know more about the OAK-D, make sure to check the interview of Brandon Gilles, the Chief Architect of the OpenCV AI Kit, by Ritesh Kanjee. The kit has raised over $800,000 as part of its Kickstarter campaign, with more than 4,000 supporters. If you are interested, you can also find out more about the kit in the Luxonis community Slack channel (https://luxonis-community.slack.com/).

Due to the interesting features of the OAK-D, I decided to apply for the OpenCV Spatial AI competition and was lucky enough to be selected as one of the winners of Phase 1. You can also check the projects of the rest of the Phase 1 winners here.

This post is part of a series in which I will describe my journey developing with the new OAK-D as part of my competition project.

Proposed System

Illustration of how the output of the proposed system could detect people wearing a mask and their distance to the user.

The title of my proposal is "Social distancing feedback for visually impaired people using a wearable camera". Due to the current worldwide outbreak of COVID-19, social distancing has become a new social norm as a measure to prevent the spread of the pandemic.

However, visually impaired people are struggling to maintain their independence in the new socially distanced normal¹,². For blind people, it is not possible to easily confirm whether they are keeping the social distance from the people around them. As an example, a video on the Royal National Institute of Blind People (RNIB) Twitter account showed the difficulties blind people face in their daily lives due to social distancing.

Moreover, common aids for the blind such as a white cane or a guide dog cannot help them keep the social distance. Even worse, blind people cannot know whether the people close to them are wearing a mask, so they face a higher risk of infection.

For those reasons, the objective of my project is to develop a feedback system for the blind that informs the user about the distance to the people around them and whether someone is not wearing a mask.

For this type of project, where depth and artificial intelligence need to be combined in real time, the OAK-D is the ideal system. As shown in one example of the DepthAI experiments, the OAK-D is able to detect in real time the position of the faces in an image and whether they are wearing a mask or not. By combining this information with the depth information obtained from the stereo cameras, it is possible to estimate the position of the people around the user and whether someone is not wearing a mask.

Then, the system will inform the user about the distance to the people around them using five haptic motors attached to the OAK-D board. Each haptic motor corresponds to one of five direction angles: -40, -20, 0, 20 and 40 degrees. For example, if the system detects that there is a person nearby at an angle of -20 degrees (as in the image above), then the second motor from the left will vibrate. Moreover, in order to inform about how close the person is, the intensity of the vibration will change as the detected person gets closer. Finally, if the system detects that a person is not wearing a mask, it will inform the user by changing the vibration pattern.
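To make this mapping concrete, here is a minimal Python sketch of the feedback logic I have in mind. The function names (select_motor, vibration_intensity) and the 2 m feedback range are hypothetical placeholders for illustration, not the final implementation:

import numpy as np

MOTOR_ANGLES = [-40, -20, 0, 20, 40]  # degrees, one per haptic motor (left to right)

def select_motor(angle_deg):
    # Pick the motor whose direction is closest to the detected person's angle
    return int(np.argmin([abs(angle_deg - a) for a in MOTOR_ANGLES]))

def vibration_intensity(distance_m, max_range_m=2.0):
    # Stronger vibration as the person gets closer; zero when out of range
    return max(0.0, 1.0 - distance_m / max_range_m)

# Person detected at -20 degrees, 1 m away, not wearing a mask
motor_id = select_motor(-20)          # -> 1 (second motor from the left)
intensity = vibration_intensity(1.0)  # -> 0.5
wearing_mask = False
pattern = "continuous" if wearing_mask else "pulsed"  # different pattern when no mask is detected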


Windows Setup and Initial Testing

This week I received the OpenCV AI Kit. As shown in the image below, the kit contains an OAK-D, a USB-C cable, a 5V (3A) wall charger and a 3D printed GoPro mount.

Components of the OAK-D. Note: The Raspberry Pi Zero was not included with the kit, it was added only for comparing dimensions.

The OAK-D has a small size (46 x 100 mm) with a T shape. Actually, the lower part of the board matches the width of the Raspberry Pi Zero, so the system combining both boards can have a compact size, as shown in the image below.

In order to connect with the OAK-D, the people at Luxonis have developed the DepthAI Python API. The DepthAI API is open source and can run on different Operating Systems. Check the official installation site for instructions on how to install it: https://docs.luxonis.com/projects/api/en/latest/install/

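For reference, the DepthAI Python package can also be installed directly from PyPI. Whether a prebuilt package is available may depend on your platform and Python version, so treat this as a shortcut and fall back to the official instructions above if it fails:

python -m pip install depthai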

Once I installed the DepthAI API and its dependencies, I was able to run the default demo with the following command (make sure to be in the depthai folder):

python depthai.py

This demo by default runs the MobileNet SSD object detection model, which can detect 20 different types of objects (bicycle, car, cat…) in an image. Moreover, the demo combines the bounding box of each detected object with the depth information from the stereo cameras to provide the 3D position of each detected object. As an example, below I show the output of the demo for the detection of a water bottle, which is one of the classes the default demo model can detect.
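To get an intuition for what the demo does when it combines detections with depth, the snippet below back-projects the center of a bounding box using the depth map and a pinhole camera model. This is only an illustration: the intrinsics (fx, fy, cx, cy) are made-up values, and the DepthAI demo already computes these 3D coordinates for you.

import numpy as np

# Hypothetical intrinsics of the depth stream (illustrative values only)
fx, fy, cx, cy = 430.0, 430.0, 320.0, 200.0

def bbox_to_3d(bbox, depth_map_mm):
    # bbox = (xmin, ymin, xmax, ymax) in pixels; depth map values in millimeters
    u = (bbox[0] + bbox[2]) // 2
    v = (bbox[1] + bbox[3]) // 2
    z = depth_map_mm[v, u] / 1000.0   # depth at the box center, in meters
    x = (u - cx) * z / fx             # back-project with the pinhole model
    y = (v - cy) * z / fy
    return x, y, z

# Example with a dummy depth map and a detection around the image center
depth_map_mm = np.full((400, 640), 1500, dtype=np.uint16)   # 1.5 m everywhere
print(bbox_to_3d((300, 180, 340, 220), depth_map_mm))        # -> (0.0, 0.0, 1.5)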

The demo was able to track the object and calculate the depth at 30 fps without any problem. By looking at the Python code of the depthai.py script, I saw that the demo can be configured to other modes by adding arguments when running it. For example, by running the following command it is possible to obtain the colorized depth (Note: this only works for the Refactory version for Windows 10; in the original repository the configuration has changed):

python depthai.py --streams depth_color_h
Depth output using the OAK-D.

Overall, the depth looks pretty good, with some black regions on the left of the background. However, that region contains glass panels, and the stereo camera system probably cannot extract many features from them, which is why no depth was estimated for those regions.


Depth Estimation: OAK-D vs. Azure Kinect DK

Even though depth estimation is not the main feature of the OAK-D, I wanted to compare its depth estimation with that of the latest Azure Kinect DK. For that purpose, I wrote a small Python script (hello_depth.py) that reads the raw depth values and displays the depth as in the Azure Kinect.
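The core of such a visualization is just normalizing the 16-bit depth frame and applying a colormap with OpenCV. Below is a minimal sketch of that step; the colorize_depth helper and the 4 m clipping range are my own illustrative choices rather than exactly what hello_depth.py does:

import cv2
import numpy as np

def colorize_depth(depth_mm, max_range_mm=4000):
    # Clip to the range of interest and scale the depth to 8 bits
    depth_clipped = np.clip(depth_mm, 0, max_range_mm)
    depth_8bit = (255.0 * depth_clipped / max_range_mm).astype(np.uint8)
    # Apply a colormap similar to the Azure Kinect viewer
    return cv2.applyColorMap(depth_8bit, cv2.COLORMAP_JET)

# Example usage with a dummy 16-bit depth frame (values in millimeters)
depth_frame = np.random.randint(500, 4000, (400, 640), dtype=np.uint16)
cv2.imshow("depth", colorize_depth(depth_frame))
cv2.waitKey(0)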

As for the Azure Kinect, I used the depth estimation example program from my Python repository for the Azure Kinect SDK. In the image below, the estimated depth from both devices is compared.

Comparison of the depth estimation for the Azure Kinect and the OAK-D.

As can be observed, even though the OAK-D uses a stereo camera, the results are very good. In particular, in the case of the table, the OAK-D was able to estimate the depth correctly, whereas the ToF sensor in the Azure Kinect failed.


That is all for this first part. In the next part, I will test the face mask detection example using the OAK-D. I will also be uploading all the scripts for this project to my repository: https://github.com/ibaiGorordo/Social-Distance-Feedback.


References

[1]: Gus Alexiou. (June 7, 2020). Blind People’s Social Distancing Nightmare To Intensify As Lockdowns Ease.

[2]: BBC News. (June 24, 2020). Social distancing a ‘struggle’ for those visually impaired.

