Google Vision API for image analysis with Python

Srujana Takkallapally
Towards Data Science
3 min read · Nov 12, 2019


Google Vision API detects objects, faces, and printed and handwritten text in images using pre-trained machine learning models. You can upload images to the tool one at a time and get their contents, but if you have a large set of images on your local desktop, sending requests to the API with Python is much more practical.

This article shows how to create a Google Cloud bucket, upload images to it, and perform label detection on a large dataset of images using Python and the Google Cloud SDK. The "gsutil" tool is used for fast upload of images and for setting a lifecycle on the bucket, and all images are analyzed with batch processing.

Step 1: Create a project

Follow the steps in the link below to create a new project and enable the Google Vision API. Store the service account key in a JSON file.

Step 2: Download the Google Cloud SDK along with gsutil

The gsutil tool makes it easy to upload a large dataset of images to a Google Cloud bucket. Run the following command in a command prompt or terminal:

curl https://sdk.cloud.google.com | bash

Alternatively, you can download the SDK from the following links:

Mac OS: https://cloud.google.com/sdk/docs/quickstart-macos (store the folder in your home directory)

Windows: https://cloud.google.com/sdk/docs/quickstart-windows

Step 3: Set the configuration

The following commands connect the SDK to the Google Cloud project created in step 1. Type this in the terminal:

gcloud init

Pick configuration to use: select "Create a new configuration".

Choose the account to perform operations for: if you don't see your Gmail account, select "Log in with a new account" and log in.

Pick cloud project to use: you should see the project you created in step 1; select it.

Step 4: Upload images to google cloud storage

Create a bucket: gsutil mb 'gs://bucketname' (bucket names must be globally unique)

Upload the image folder from your local desktop to the bucket:

gsutil -m cp -R 'path/to/imagefolder' 'gs://bucketname'

Step 5: Get labels for images in google bucket

Now that all the images are in the bucket, get labels using 'ImageAnnotatorClient'. If you have a lot of images, iterating through every image in the bucket one request at a time is slow. Batch processing speeds this up, subject to a limit of 16 images per batch (https://cloud.google.com/vision/quotas).
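The batching itself is just list slicing into chunks of at most 16. A minimal standalone sketch (the file names here are made up for illustration):

```python
# Split image paths into batches of at most 16 (the Vision API per-request limit)
def make_batches(paths, batch_size=16):
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]

paths = ["gs://bucket_name/img%d.jpg" % i for i in range(40)]
print([len(b) for b in make_batches(paths)])  # [16, 16, 8]
```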

# install google cloud vision first:
# pip install google-cloud-vision

# import dependencies
from google.cloud import vision
from google.cloud import storage
from google.cloud.vision_v1 import enums
from google.cloud.vision_v1 import types
import os
import json

# service account key created in step 1
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = 'project_key.json'

# Get the GCS bucket and list its contents
storage_client = storage.Client()
bucket = storage_client.bucket('bucket_name')
image_paths = ["gs://bucket_name/" + blob.name for blob in bucket.list_blobs()]

# We can send a maximum of 16 images per request
client = vision.ImageAnnotatorClient()
label_output = []
for start in range(0, len(image_paths), 16):
    batch_paths = image_paths[start:start + 16]
    requests = []
    for image_path in batch_paths:
        image = types.Image()
        image.source.image_uri = image_path
        requests.append({'image': image,
                         'features': [{'type': enums.Feature.Type.LABEL_DETECTION}]})
    response = client.batch_annotate_images(requests)
    for image_path, res in zip(batch_paths, response.responses):
        labels = {label.description: label.score for label in res.label_annotations}
        label_output.append({'filename': os.path.basename(image_path),
                             'labels': labels})

# export the results to a JSON file
with open('image_results.json', 'w') as outputjson:
    json.dump(label_output, outputjson, ensure_ascii=False)

The label detection results are now stored in the JSON file image_results.json.
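As a quick example of consuming that file, the snippet below keeps only the labels whose confidence meets a threshold; the sample record and the 0.9 cutoff are assumptions for illustration:

```python
def filter_labels(results, threshold=0.9):
    """Keep only labels whose confidence score meets the threshold."""
    return {r['filename']: {label: score
                            for label, score in r['labels'].items()
                            if score >= threshold}
            for r in results}

# sample shaped like the entries written to image_results.json
sample = [{'filename': 'cat.jpg', 'labels': {'Cat': 0.98, 'Whiskers': 0.85}}]
print(filter_labels(sample))  # {'cat.jpg': {'Cat': 0.98}}
```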

Step 6: Delete images from the Google bucket: You may want to delete the images once the analysis is done, since keeping them incurs storage costs. Deleting each image in a loop takes time. Instead, set a lifecycle for the bucket so that its contents are deleted automatically after a set age. Paste the following into a file saved as lifecycle.json, then run the gsutil command below.

# "age" is the number of days after an object's creation before it is deleted
{
  "rule":
  [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 2}
    }
  ]
}
# This command sets the lifecycle for the bucket
gsutil lifecycle set 'lifecycle.json' 'gs://bucket_name'

If you still have questions, or want to try text or face detection, check out https://codelabs.developers.google.com/codelabs/cloud-vision-api-python/index.html?index=#0

Hope this article helps. Happy reading!
