Beyond Object Identification: A Giant-Leap into Pattern Discovery in Imagery Data

A short and sweet tutorial on discovering correlations between the objects in imagery data

Uday Kiran RAGE
Towards Data Science


A critical question that arises after identifying the objects (or class labels) in an imagery database is: “How are the various objects discovered in an imagery database correlated with one another?” This article answers this question by presenting a generic framework that helps readers discover hidden correlations between the objects in an imagery database. (The purpose of this article is to encourage upcoming researchers to publish quality research papers in top conferences and journals. A portion of this article is drawn from our work published in IEEE BIGDATA 2021 [1].)

The framework to discover the correlation between the objects in an imagery database is shown in Figure 1. It involves the following three steps:

  1. Extract the objects (or class labels) and their probability scores for each image in the repository. The users can extract objects using object detection/instance segmentation/semantic segmentation techniques.
  2. Transform the objects and their probability scores into a database of your choice. (If necessary, prune uninteresting objects having low probability scores to reduce noise.)
  3. Depending on the generated database and the knowledge required, apply a suitable pattern mining technique to discover interesting correlations between the objects in the imagery data.
Figure 1: Framework to discover interesting patterns in imagery data

Demonstration: In this demo, we first pass the image data into a pre-trained model (e.g., Faster R-CNN with a ResNet-50 backbone) and extract objects and their scores. Next, the extracted data is transformed into a transactional database. Finally, we perform (maximal) frequent pattern mining on the generated transactional database to discover frequently occurring sets of objects in the imagery data. Figure 2 shows an overview of our demo.

Figure 2: Overview of discovering patterns in Imagery data

Prerequisites:

  1. We assume the readers are familiar with instance/semantic segmentation and pattern mining. We recommend Philippe Fournier-Viger’s video lectures on pattern mining.
  2. Install the following Python packages: pip install pami torchvision
  3. Download the imagery database from [2]

(Please install any additional packages needed depending on your computing environment.)
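Optionally, verify the installation before proceeding. The snippet below is a minimal sanity check; it assumes the packages are registered under the distribution names pami and torchvision.

# minimal sanity check; the distribution names are assumptions
from importlib.metadata import version

print('pami:', version('pami'))
print('torchvision:', version('torchvision'))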

Step 1: Extracting Objects and Their Scores from Imagery Data

Step 1.1: Load pre-trained object detection model

Save the code below as objectDetection.py. This code accepts the path of an image as input, loads the pre-trained Faster R-CNN model with a ResNet-50 backbone, and outputs a list (i.e., self.predicted_classes) containing class labels and their scores. Each element in this list is a (class label, probability score) pair detected in the image.

import glob
import os
import torch
import torchvision
from torchvision import transforms
from PIL import Image
from PAMI.extras.imageProcessing import imagery2Databases as ob

class objectDetection:
    def __init__(self):
        # load the pre-trained Faster R-CNN model with a ResNet-50 FPN backbone
        self.model_ = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
        self.model_.eval()
        # freeze all weights; we only run inference
        for name, param in self.model_.named_parameters():
            param.requires_grad = False

    def model(self, x):
        # run the detector without tracking gradients
        with torch.no_grad():
            self.y_hat = self.model_(x)
        return self.y_hat

    def model_train(self, image_path):
        # COCO class labels; 'N/A' entries are unused label ids
        self.coco_instance_category_names = [
            '__background__', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
            'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A', 'stop sign',
            'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
            'elephant', 'bear', 'zebra', 'giraffe', 'N/A', 'backpack', 'umbrella', 'N/A', 'N/A',
            'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
            'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
            'bottle', 'N/A', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
            'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
            'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'N/A', 'dining table',
            'N/A', 'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
            'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A', 'book',
            'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'
        ]
        self.transform = transforms.Compose([transforms.ToTensor()])
        self.image = Image.open(image_path)
        # downscale the image to half its size to speed up inference
        self.image = self.image.resize([int(0.5 * s) for s in self.image.size])
        self.image = self.transform(self.image)
        # predictions without any threshold on the scores
        self.predict = self.model([self.image])
        # pair every predicted label id with its probability score
        self.predicted_classes = [
            (self.coco_instance_category_names[i], p)
            for i, p in zip(list(self.predict[0]['labels'].numpy()),
                            self.predict[0]['scores'].detach().numpy())
        ]
        return self.predicted_classes
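Before processing an entire folder, it may be worth smoke-testing the class on a single image. The snippet below is a minimal sketch; the file name aizu_dataset/sample.JPG and the 0.5 cut-off are illustrative assumptions, not part of the original pipeline.

# hypothetical smoke test; the image path and cut-off are placeholders
detector = objectDetection()
predictions = detector.model_train('aizu_dataset/sample.JPG')

# keep only confident detections for a quick visual check
confident = [(label, round(float(score), 2)) for label, score in predictions if score >= 0.5]
print(confident)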

Step 1.2: Detecting objects from each image

The code below loads the model once, identifies the objects in each image, and appends them to a list called detected_objects_list. This list is transformed into a transactional database in the next step.

from objectDetection import objectDetection
from PAMI.extras.imageProcessing import imagery2Databases as ob
import glob
import os

# folder containing the input images
images_path = 'aizu_dataset'

# load the pre-trained model once, outside the loop
model_predict = objectDetection()

# list to store the objects detected in each image
detected_objects_list = []

# read every image in the folder and collect its detected objects
for filename in glob.glob(os.path.join(images_path, '*.JPG')):
    # the model returns (class label, probability score) pairs for one image
    objects_detected = model_predict.model_train(filename)
    detected_objects_list.append(objects_detected)
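Each entry of detected_objects_list corresponds to one image and holds (class label, probability score) pairs. A quick way to inspect what was collected:

# number of images processed
print(len(detected_objects_list))

# first five (label, score) pairs detected in the first image
print(detected_objects_list[0][:5])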

Step 2: Creating a transactional database

Prune the uninteresting class labels using the code below, and save the remaining data as a transactional database.

# prune uninteresting objects whose probability score is below a threshold, say 0.2
obj2db = ob.createDatabase(detected_objects_list, 0.2)

# save the objects identified in the images as a transactional database
obj2db.saveAsTransactionalDB('aizu_dataset0.2.txt', ',')
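For readers who want to see what this transformation amounts to, below is a rough pure-Python equivalent. It is a sketch only; PAMI’s internal pruning and ordering may differ. Each image becomes one transaction containing the distinct labels whose scores clear the threshold.

# illustrative pure-Python equivalent of the helper above
threshold = 0.2
with open('aizu_dataset0.2_manual.txt', 'w') as out:
    for detections in detected_objects_list:
        labels = {label for label, score in detections if score >= threshold}
        if labels:
            out.write(','.join(sorted(labels)) + '\n')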

View the generated transactional database file by typing the following command:

!head -10 aizu_dataset0.2.txt

The output will be as follows:

motorcycle,backpack,person
book,baseball bat,refrigerator,cup,toaster
bottle,bowl,tv,toilet,chair,mouse,refrigerator,cell phone,microwave,remote,sink
microwave,refrigerator,bowl,bottle,cell phone,oven,car,person
bench
potted plant
bottle,handbag,suitcase,book
book,laptop,tv,umbrella
oven
parking meter,car

Step 3: Extracting patterns from the transactional database

Apply the maximal frequent pattern-growth algorithm (MaxFPGrowth) on the generated transactional database to discover the hidden patterns. In the code below, we find the patterns (i.e., sets of class labels) that have appeared in at least ten images of the imagery database.

from PAMI.frequentPattern.maximal import MaxFPGrowth as alg

obj = alg.MaxFPGrowth('aizu_dataset0.2.txt', 10, ',')
obj.startMine()
print(obj.getPatterns())
obj.savePatterns('aizuDatasetPatterns.txt')
print('Runtime: ' + str(obj.getRuntime()))
print('Memory: ' + str(obj.getMemoryRSS()))
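If you also want the non-maximal patterns for comparison, PAMI ships a basic frequent pattern-growth miner. The sketch below assumes it follows the same interface as MaxFPGrowth above.

# assumption: PAMI's basic FPGrowth mirrors the MaxFPGrowth interface
from PAMI.frequentPattern.basic import FPGrowth as alg2

obj2 = alg2.FPGrowth('aizu_dataset0.2.txt', 10, ',')
obj2.startMine()
print(len(obj2.getPatterns()))  # all frequent patterns, typically many more than the maximal ones
obj2.savePatterns('aizuDatasetAllPatterns.txt')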

View the patterns generated by the maximal miner by typing the following command:

!head -10 aizuDatasetPatterns.txt

The output will be as follows:

refrigerator	microwave	:11
toilet :10
cell phone :11
traffic light :12
truck :12
potted plant :12
clock :15
bench :17
oven :17
car :18

The first pattern/line says that 11 images in the image repository contained both the class labels refrigerator and microwave. Similar statements can be made for the remaining patterns/lines.
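To post-process the result, say, rank the patterns by support, the saved file can be parsed directly. The sketch below assumes the tab-and-colon layout shown above.

# parse 'items :support' lines and rank the patterns by support
patterns = []
with open('aizuDatasetPatterns.txt') as f:
    for line in f:
        if ':' not in line:
            continue
        items, _, support = line.strip().rpartition(':')
        patterns.append((items.split(), int(support)))

for items, support in sorted(patterns, key=lambda p: -p[1])[:5]:
    print(support, '+'.join(items))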

Knowing the correlations between the different objects/class labels can support users in making better decisions.

Conclusion:

Efficient identification of objects in imagery data has been widely studied in industry and academia. A key question after identifying the objects is: what is the underlying correlation between the various objects in the imagery data? This blog answers this crucial question by providing a generic methodology: transform the objects discovered in the image data into a transactional database, apply pattern mining techniques, and discover interesting patterns.

Disclaimer:

  1. All images displayed on this page were drawn by the author.
  2. The imagery database was created by the author and open-sourced for both commercial and non-commercial purposes.

References:

[1] Tuan-Vinh La, Minh-Son Dao, Kazuki Tejima, Rage Uday Kiran, Koji Zettsu: Improving the Awareness of Sustainable Smart Cities by Analyzing Lifelog Images and IoT Air Pollution Data. IEEE BigData 2021: 3589–3594

[2] Imagery dataset: aizu_dataset.zip


Associate Professor @ UoAizu. Published over 80 papers in top CS conferences, such as PAKDD, EDBT, CIKM, SSDBM, IEEE FUZZY, IEEE BIGDATA, DASFAA, and DEXA.