Introduction
Social distancing is a term that has taken the whole world by storm and is transforming the way we live. Social distancing, also called "physical distancing", means keeping a safe space between yourself and people who are not from your household. As the country has started to unlock amid surging COVID-19 cases, maintaining social distancing has become a key issue. The biggest concern about COVID-19 is how quickly the infection spreads from one person to another through contact, or even through being close to an infected person. Social distancing is likely to stay with us longer than expected in the fight against COVID-19.
This got me thinking about developing an AI model that detects whether people are following social distancing and, at the same time, whether they are wearing masks. Here is a sample of the model's output. With deep learning and OpenCV, we can extract interesting insights from video clips. A red bounding box indicates that the person is close to another person, and a blue box indicates that the person is maintaining social distance. Separate bounding boxes show whether each person is wearing a mask.

You can find the code I used in my GitHub repo.
Overview of the steps
The TensorFlow Object Detection API is a framework for building deep learning networks that solve object detection problems. The API provides pre-trained object detection models, collectively referred to as the Model Zoo, which have been trained on the COCO dataset. The Common Objects in Context (COCO) dataset has 200,000 images with more than 500,000 object annotations across 90 commonly found object classes. See the image below for objects that are part of the COCO dataset.

In this case, we care about the class ‘Person’, which is part of the COCO dataset.
The API also supports a large set of models. See the table below for reference.

The models trade off speed against accuracy. Here I have chosen ssd_mobilenet_v1_coco to detect ‘Person’. Once the object detection API has identified each ‘Person’, we can use OpenCV, a powerful library for image processing, to predict whether that person is maintaining social distance. Once social distancing has been checked, I use a faster_rcnn_inception_v2_coco model, which I had previously fine-tuned on 2000 images using a GPU (NVIDIA Quadro P4000, Linux), to detect whether the person is wearing a mask.
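To make the first stage more concrete, here is a minimal sketch of how confident ‘Person’ detections could be picked out of the raw model output. It is not the exact notebook code. It assumes the usual TensorFlow Object Detection API output layout (parallel lists of boxes, classes and scores, with boxes given as normalized [ymin, xmin, ymax, xmax]) and that ‘Person’ has class id 1 in the COCO label map. The function name `filter_persons`, its parameters and the sample values are hypothetical, chosen only for illustration.

```python
# Sketch: keep only confident 'person' detections from raw detector outputs.
# Assumption: boxes are normalized [ymin, xmin, ymax, xmax], as in the
# TensorFlow Object Detection API; all names below are illustrative.

PERSON_CLASS_ID = 1  # 'person' in the COCO label map used by the API


def filter_persons(boxes, classes, scores, frame_width, frame_height,
                   min_score=0.60):
    """Return pixel boxes (left, top, right, bottom) for confident persons."""
    persons = []
    for box, cls, score in zip(boxes, classes, scores):
        # Skip other classes and detections at or below the threshold.
        if int(cls) != PERSON_CLASS_ID or score <= min_score:
            continue
        ymin, xmin, ymax, xmax = box
        persons.append((
            int(xmin * frame_width),
            int(ymin * frame_height),
            int(xmax * frame_width),
            int(ymax * frame_height),
        ))
    return persons


# Hypothetical detector outputs for a single 640x480 frame.
boxes = [
    [0.125, 0.25, 0.875, 0.5],   # confident person
    [0.25, 0.5, 0.75, 0.75],     # person, but low confidence
    [0.0, 0.0, 0.5, 0.5],        # confident, but not a person
]
classes = [1, 1, 3]
scores = [0.92, 0.41, 0.88]

person_boxes = filter_persons(boxes, classes, scores, 640, 480)
print(person_boxes)
```

Only the first detection survives the filter here, because the second falls below the confidence threshold and the third belongs to a different class.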
Deep Dive into the main steps
Now let’s go into the code in detail.
If you are using the TensorFlow Object Detection API for the first time, please download the repository from GitHub using this link.
The main steps I followed are listed below (please follow along in the Jupyter notebook on my GitHub):
- Load the ssd_mobilenet_v1_coco model into a graph and load the list of classes that are part of the COCO dataset
- Open the video using cv2.VideoCapture(filename), read the frames one by one and write each frame to a path
- For each frame, perform object detection using the loaded graph
- The result that comes back from ssd_mobilenet_v1_coco is each identified class along with its confidence score and bounding box prediction
- Based on the class and a confidence score > 0.60, detect the number of persons in a frame
- Draw a blue bounding box for each person from the bounding box prediction obtained previously and find the midpoint of the box. Mark each bounding box with an id
- Find the Euclidean distance between the midpoints in a frame
- Using the Euclidean distances, find the bounding boxes whose midpoints are less than 200 pixels apart and change the color of those boxes to red
- Put all the code pieces together, pass all the frames through and save them to a path. This gives us a set of frames annotated with social distancing results
- Load the frozen_inference_graph.pb (faster_rcnn_inception_v2_coco), which I fine-tuned on images with and without masks, into a graph and load its list of classes
- For each annotated frame, detect whether people are wearing masks using the loaded faster_rcnn graph
- Finally, create a video from the frames obtained previously using the moviepy package in Python
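The distance check in the steps above can be sketched roughly as follows. This is a simplified illustration rather than the notebook code. It assumes that each box is given in pixels as (left, top, right, bottom), that the "midpoint" is the center of the box, and that the threshold of 200 is measured in pixels. The helper names `box_midpoint` and `color_boxes_by_distance` and the sample boxes are hypothetical.

```python
import math
from itertools import combinations

# OpenCV stores colors in BGR order, so blue comes first.
BLUE = (255, 0, 0)
RED = (0, 0, 255)


def box_midpoint(box):
    """Center of a pixel box given as (left, top, right, bottom)."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def color_boxes_by_distance(boxes, min_distance=200):
    """Map each box id to RED if any other box is closer than min_distance."""
    midpoints = {box_id: box_midpoint(box) for box_id, box in enumerate(boxes)}
    colors = {box_id: BLUE for box_id in midpoints}  # everyone starts as safe
    for (id_a, mid_a), (id_b, mid_b) in combinations(midpoints.items(), 2):
        # Euclidean distance between the two midpoints, in pixels.
        distance = math.hypot(mid_a[0] - mid_b[0], mid_a[1] - mid_b[1])
        if distance < min_distance:
            colors[id_a] = RED
            colors[id_b] = RED
    return colors


# Hypothetical person boxes in one frame: two people close together, one apart.
person_boxes = [(100, 100, 200, 400), (220, 110, 320, 410), (900, 100, 1000, 400)]
box_colors = color_boxes_by_distance(person_boxes)
print(box_colors)
```

Each color could then be passed to a drawing call such as cv2.rectangle(frame, (left, top), (right, bottom), color, 2) to render the box on the frame. Comparing every pair of people is quadratic in the number of detections, which should be acceptable for the handful of people usually visible in one frame.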
Conclusion and improvements
This brings us to the end of my article. Social distancing, along with other basic sanitary measures, is very important for keeping the spread of COVID-19 as slow as possible. This project is only a proof of concept.
I am well aware that this project is not perfect. Here are a few ideas on how this application can be improved:
- I have come across a few approaches where people convert the video into a top view (bird's-eye view) and then compute the distance between two objects in an image
- Take camera calibration into account
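To give a rough idea of the bird's-eye view approach, here is a small sketch that projects each person's ground point through a 3x3 homography before measuring the distance. It is not part of my project. The matrix `HOMOGRAPHY` below is made up purely for illustration; in practice it would typically be estimated from four reference points on the ground, for example with cv2.getPerspectiveTransform. Using the bottom center of the bounding box as the ground point is also an assumption, and the helper names are hypothetical.

```python
import math


def ground_point(box):
    """Bottom-center of a pixel box (left, top, right, bottom): roughly the feet."""
    left, top, right, bottom = box
    return ((left + right) / 2, bottom)


def to_top_view(point, homography):
    """Project an image point (x, y) through a 3x3 homography (nested lists)."""
    x, y = point
    h = homography
    w = h[2][0] * x + h[2][1] * y + h[2][2]  # perspective divide term
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def top_view_distance(box_a, box_b, homography):
    """Euclidean distance between two people measured in the top view."""
    ax, ay = to_top_view(ground_point(box_a), homography)
    bx, by = to_top_view(ground_point(box_b), homography)
    return math.hypot(ax - bx, ay - by)


# Made-up homography, for illustration only (not calibrated to any camera).
HOMOGRAPHY = [
    [1.0, -0.5, 200.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.001, 1.0],
]

# Two hypothetical person boxes whose feet are 200 pixels apart in the image.
distance = top_view_distance((250, 100, 350, 400), (450, 100, 550, 400), HOMOGRAPHY)
print(round(distance, 2))
```

The point of the transform is that distances in the top view no longer depend on how far a person stands from the camera, so one threshold can be applied across the whole frame.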
"Home is a shelter from storms – all sorts of storms. Stay safe at home."