What does computer vision see in the 2020-US Election news feed? (Part 1/2)

Part 1 - Generating batch AI-annotated images and pre-processing data to build feature-level image datasets.

Srinivas Vadrevu
Towards Data Science


Data visualization grid generated from data annotated on a sample batch of US Election 2020 images; Background Photo by Dan Dennis on Unsplash; Analysis by Srinivas Vadrevu

Edit:

This is the first article of a two-part series. You can find the link to part 2 here

Visual content strategy is now an integral part of marketing, owing to both the prevalence and importance of images in customer purchase journeys. Yet visual content creation remains rooted mainly in tacit knowledge and creative domains. With the advent of computer vision, should companies operationalize AI to build explicit knowledge of image features and explore what drives impressions and influences purchase decisions? In other words, is the time ripe for AI-generated data to complement the creative process? With the US elections around the corner, I chose them as a current marketing context in which to use Cloud Vision AI, and in this first part of the article I explain how to build extensive feature-level datasets.

I am sure you have heard the adage...

“A picture is worth a thousand words”

Fig 1: A blurred version of an early ad highlighting image importance, by Frank Barnard in 1921; Source: http://www2.cs.uregina.ca/~hepting/projects/pictures-worth/

Did you know that this English adage traces its origins to advertising? Although it is often attributed to an Asian proverb, a few early advertisers used it to underscore the importance of a picture in generating impressions.

The ad by Frank Barnard in 1921 (see the image above) puts it this way:

“Buttersweet is Good to Eat” is a very short phrase but it will sell more goods if presented, with an appetizing picture of the product, to many people morning, noon and night, every day in the year than a thousand word advertisement placed before the same number of people only a limited number of times during the year ….

It is simply the preponderance of favorable impressions for a meritorious product that reminds the consumer to buy it again and again”— Frank Barnard

While it is a bit ironic that you do not see any pictures in the ad itself, it highlights that images, through their medium (on the cars), play an important role in leaving favorable impressions on viewers. These impressions could potentially influence their decision to purchase the products. Yes, this was when the word 'impression' was more than a marketing KPI metric.

Fast forward from 1921 to the current day, and you find yourself accelerating through photos, billboards, TV ads, ads on the internet, memes, display ads, Instagram feeds, and other social media. After a dizzying ride, you find yourself scrolling through a few posts on Twitter as part of your daily routine. Frank Barnard's cars as a channel for acquiring users, and the impressions they generated, have now been replaced by the social media feed. You come across a post in your feed (see below) and consider the new product for five seconds before scrolling on. You are left with an 'impression' of the earbuds, which might shape your decision to engage with the product in the future. Little did Frank Barnard know that his take on advertising images would still be relevant 100 years later.

A recent survey underscores the importance and prevalence of visual imagery in advertising through its findings:

  • 60% of consumers say they’re more likely to consider or contact a business that has an image show up in local search results
  • 67% of consumers say that a product image's quality is paramount in selecting and purchasing the product.

Motivation and context:

With roughly 2 out of every 3 customers relying on visual creatives for their purchase decisions, the importance of visual creatives in marketing cannot be overstated. From my recent experience with a few startups and from reading quite a bit of material, visual content strategy seemed to revolve mainly around (1) the audience and their goals, (2) the brand image and its goals, and (3) the distribution media. The actual image content and composition, however, was mainly a creative effort built on marketers' tacit knowledge. The motivation behind this article was therefore to explore whether machine learning/computer vision can analyze images and generate explicit knowledge about content strategy to complement the creative effort of producing visual content. In my view, AI image analytics could open up many more possibilities for visual content marketing, much as sabermetrics did for baseball in the 2000s.

Pinned Tweet by presidential candidate

Setting the context: With the presidential campaigns in full swing and elections around the corner, I could not think of a more current and relevant marketing context: the presidential candidates are proactively campaigning (read: marketing) themselves to the American populace to vote for (read: purchase) them in the upcoming elections on Nov 3, 2020. A glance at the two articles listed below highlights the role of images/pictures (read: visual media) in influencing the elections (read: purchase decisions). Due to COVID-19, the role of images has only been amplified. I know it's a stretch, but we do seem to have parallels between political campaigns and company marketing efforts, especially in the role of imagery.

After setting the context, I decided to analyze the visual content on the current theme, US Election Campaigns 2020, across different sources such as Google Images, newspaper websites, and a news API, and to operationalize computer vision for image analysis. I divided the main article into three sections:

  • Aggregating a set of images around a given theme (US Election Campaigns 2020 in this article) - Part 1a
  • Batch processing the images through Vision AI to annotate features for each image - Part 1b
  • Visualizing the features for insights - Part 2

In this article, I will cover the first two sections in detail, along with the code to generate your own AI-annotated feature datasets from a batch of images on the internet. In Part 2, I will discuss charting these datasets and exploring insights from them.

Part 1a: Data Collection- Aggregating a set of images around a topic — US Elections 2020

Over the last couple of weeks, I wanted to build a personal image dataset of photos about the US Elections 2020. For this, I identified different sources for creating an image database, each picture of which would be annotated by ML. Below, I list three sources along with the code and approach for collecting URLs from each.

a) NEWS API

Photo by Siora Photography on Unsplash

This is an easy-to-use HTTPS REST API where you can request "everything" about the articles generated for a query. The 'everything' endpoint provides you with:

status, totalResults, articles, source, author, title, description, url, urlToImage <the link to the image in the article>, publishedAt, and content <the text of the article>

You can find the code below (Gist 1) to call NewsAPI and push the data into a pandas data-frame for easier readability.

Gist 1: Code for pulling news articles from NewsAPI into a pandas data-frame

In the gist, I keyed in the URL parameters and collected the response from NewsAPI in JSON format. From the JSON response, I select the articles and loop over them to append each article's attributes, including the image URL.
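Since the gist is an external embed, here is a minimal sketch of what such a call could look like with the requests library; the API key, query, and date window are placeholders you would substitute with your own.

# Minimal sketch of a NewsAPI 'everything' call pushed into a pandas data frame.
# The API key, query, and dates below are placeholders.
import requests
import pandas as pd

NEWS_API_KEY = "<your-api-key>"
endpoint = "https://newsapi.org/v2/everything"
params = {
    "q": "US elections 2020",
    "from": "2020-10-18",
    "to": "2020-10-31",
    "language": "en",
    "pageSize": 100,
    "apiKey": NEWS_API_KEY,
}

response = requests.get(endpoint, params=params).json()

rows = []
for article in response.get("articles", []):
    rows.append({
        "source": article["source"]["name"],
        "author": article["author"],
        "title": article["title"],
        "description": article["description"],
        "url": article["url"],
        "urlToImage": article["urlToImage"],  # link to the image in the article
        "publishedAt": article["publishedAt"],
    })

news_df = pd.DataFrame(rows)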

Photo by Edho Pratama on Unsplash

b) Google Image Search - The other source of images around a topic is Google Images. It is a single touchpoint for anyone looking for images related to a topic. While looking up approaches to source Google Images into the dataset, I came across this article by @fabianbosler. It's definitely a good read for anyone looking to scrape images responsibly from the internet.

I used the gist code in that article to extract the images and the URLs around US Election news.

c) News websites - What could be a better source of images for US Elections 2020 news than the news websites themselves? Many popular news websites provide their own APIs for developers to access their data. You can check their websites for API docs and endpoints; a couple of them are https://developer.nytimes.com/ and http://developer.cnn.com/ (to be launched soon). If there is no API for a website, you can try the classic Beautiful Soup package to scrape the site for image URLs.

Say you want to scrape the abcdef.com website for all articles between 18th October and 31st October. First, check the structure of the website's URLs to identify any date string that can be looped over to generate the article URLs sequentially. For each URL, you can find all the images on the page, get their 'src' attributes, append them to a data frame, and repeat the iteration for all the article URLs you need (based on date).

Gist 2: Calling beautiful soup to extract the image URLs in the main article URL
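The gist itself is embedded externally; as a rough sketch of the same idea, assuming a hypothetical abcdef.com URL pattern with a date string in it:

# Rough sketch: loop over dates, build article URLs, and collect the 'src' of every image.
# The abcdef.com URL pattern is hypothetical; adapt it to the site you are scraping.
import requests
import pandas as pd
from bs4 import BeautifulSoup

dates = pd.date_range("2020-10-18", "2020-10-31")
rows = []

for date in dates:
    article_url = f"https://www.abcdef.com/politics/{date:%Y/%m/%d}/election-coverage"
    page = requests.get(article_url)
    if page.status_code != 200:
        continue  # skip dates with no article
    soup = BeautifulSoup(page.text, "html.parser")
    for img in soup.find_all("img"):
        rows.append({
            "Source": "abcdef.com",
            "Query/alt": img.get("alt"),
            "urls": img.get("src"),
        })

links_df = pd.DataFrame(rows)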

Before you scrape anything off the internet for personal or commercial use, it is good practice to go through the site's terms of use/service to check that you are not violating any terms.

Using a combination of these approaches, I collected images from a few popular websites covering US Election 2020 news to build an extensive image dataset for analyzing the trends. The list of image URLs is stored in links_df, and the unique URLs can be extracted as a list for annotation. There could be cases where the same URL is extracted from multiple sources, for example when an image from a news website also appears in Google Image Search results.

url_list = links_df['urls'].unique().tolist()

Part 1b: Annotate the collected images at a batch level using Cloud Vision API and convert the JSON responses into data frames

With the image URL dataset ready, I used Cloud Vision API to annotate various features through its REST API. For this, I created a GCP project, enabled Cloud Vision API, created a service account, and generated the private credentials in JSON format for authentication. Then, I created a virtual environment using the Terminal to install the Google Cloud Vision client library.

virtualenv <your-env>
source <your-env>/bin/activate
<your-env>/bin/pip install google-cloud-vision
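The Vision client library picks up the service-account key through the GOOGLE_APPLICATION_CREDENTIALS environment variable, so you also need to point it at the JSON file you downloaded (the path below is a placeholder):

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your-service-account-key.json"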

If you also want to create a new kernel for the project, you can use the ipykernel package:

pip install ipykernel
ipython kernel install --user --name=yourkernelname
jupyter notebook

For batch processing, I wanted to build a function, "runvisionai", that generates a feature-category dataset for the batch of URLs passed into it. Before that, I had to decide on the structure of each feature-category dataset. As an example, I will walk you through creating a facial-features dataset for an image from Unsplash (see Fig 4 below).

from google.cloud import vision

client = vision.ImageAnnotatorClient()
image = vision.Image()
url = "https://images.unsplash.com/photo-1540502040615-df7f25a5b557?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=2550&q=80"
image.source.image_uri = url
response_face_example = client.face_detection(image=image)
Fig 4: Photo by Jairph on Unsplash

Once you know the data points in the face_annotations field of the JSON response, you need to choose the level of granularity for appending to the feature-category data frame.

You can find the JSON response here for the image after sending it to Vision AI for processing. If that response is visualized, it looks like the processed image below. From the JSON response, there are several possible levels of granularity:

Fig 5: Visual representation of the face_detection response by Vision AI
  1. In the JSON file, you can condense these data points at a face level by appending the below data points for each face (which I did in this project).
Face 1: 
roll_angle: -9.5155668258667
pan_angle: -5.019717216491699
tilt_angle: 1.8756755590438843
detection_confidence: 0.9624646902084351
landmarking_confidence: 0.6258678436279297
joy_likelihood: LIKELY
sorrow_likelihood: VERY_UNLIKELY
anger_likelihood: VERY_UNLIKELY
surprise_likelihood: VERY_UNLIKELY
under_exposed_likelihood: VERY_UNLIKELY
blurred_likelihood: VERY_UNLIKELY
headwear_likelihood: VERY_UNLIKELY
Fig 6: My sketch of a data frame for collecting bounding polygon vertices for each face

2. Or… you could collect the data at a vertex level for each face, i.e., for each face, you collect the x and y coordinates of every vertex of the polygon bounding the face.

In this case, you might want to create a vertex loop nested within the face loop in the "runvisionai" function (which is, in turn, nested within the URL loop). For someone who has started coding recently, I find it useful to sketch the target dataset first and then reverse engineer the code that produces it.

Gist 3: Code to extract the bounding vertices of the faces annotated in the API response
Fig 6: Outcome of the code in Gist 3; Its a data frame where the unit is the vertex of the bounding polygon; Analysis by Srinivas Vadrevu
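In case the gist embed does not render, a minimal sketch of that vertex-level extraction, reusing response_face_example and url from the earlier call, could look like this:

# Sketch: one row per bounding-polygon vertex of each detected face.
# Assumes response_face_example and url from the face_detection() call above.
import pandas as pd

vertex_rows = []
for face_id, face in enumerate(response_face_example.face_annotations):
    for vertex_id, vertex in enumerate(face.bounding_poly.vertices):
        vertex_rows.append({
            "URL": url,
            "face": face_id,
            "vertex": vertex_id,
            "x": vertex.x,
            "y": vertex.y,
        })

vertices_df = pd.DataFrame(vertex_rows)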

3. Or… you could create a facial-feature dataset where the level of granularity is fixed at the facial feature/face landmark level; the unit item is the physical facial feature. In this case, my code schema would be:

Fig 7: My sketch of a data frame to collect the facial landmarks and their coordinates

(1) Loop over the URLs → call runvisionai → collect the face_annotations JSON for each URL

(2) Then loop over each face in face_annotations inside the function → collect the face id in the dataset →

(3) Create a nested loop over the landmarks of each face → collect the face landmark type in the dataset →

(4) Append the x, y, and z coordinates at the facial-landmark level. Repeat until all loops are completed.

Gist 4: Code to extract the vertices of the facial landmarks annotated in an image
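A compressed sketch of that loop structure, again reusing response_face_example and assuming google-cloud-vision >= 2.0 (older 1.x releases expose landmark.type instead of landmark.type_):

# Sketch: one row per facial landmark, with its 3D position.
# Assumes response_face_example and url from the earlier face_detection() call.
import pandas as pd

landmark_rows = []
for face_id, face in enumerate(response_face_example.face_annotations):
    for landmark in face.landmarks:
        landmark_rows.append({
            "URL": url,
            "face": face_id,
            "landmark": landmark.type_.name,  # e.g. LEFT_EYE, NOSE_TIP
            "x": landmark.position.x,
            "y": landmark.position.y,
            "z": landmark.position.z,
        })

landmarks_df = pd.DataFrame(landmark_rows)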

After deciding on the face-level granularity of the data and the structure of the data frames, I created an empty data frame for each feature category, to which the results from each URL iteration are appended. For the face dataset, I created a face_df data frame to collect the features (confidence score, joy, sorrow, surprise, anger, and blurred) for each face.

face_df = pd.DataFrame(columns=['Source','Query/alt','URL','img_num',
                                'face','confidence','joy','sorrow','surprise',
                                'anger','blurred'])

I set the identifiers for each image across all feature data frames to be: Source (Google Image Search / the news website name from NewsAPI / the news website), Query/alt (the search query for Google Image Search or the alt description of the image), and URL (the URL link for that image).

The logic remains very similar for the other feature categories. You can find the entire code for all feature categories below in Gist 5. The function 'runvisionai' annotates a batch of URLs and stores the data in the respective data frames for the different feature categories.

Gist 5: Code for the runvisionai function to build feature-level datasets for the URLs input into it
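Since Gist 5 is embedded externally, here is a heavily compressed sketch of the same idea, restricted to the face category and assuming google-cloud-vision >= 2.0 and the face_df defined above; the Source and Query/alt values would be looked up from links_df in the full version.

# Compressed sketch of runvisionai, limited to the face category.
# The full gist also covers labels, objects, logos, landmarks, and image properties.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

def runvisionai(urls):
    for img_num, url in enumerate(urls):
        image = vision.Image()
        image.source.image_uri = url
        response = client.face_detection(image=image)

        for face_id, face in enumerate(response.face_annotations):
            face_df.loc[len(face_df)] = [
                "web",   # Source: placeholder, looked up from links_df in the full version
                None,    # Query/alt: placeholder
                url,
                img_num,
                face_id,
                face.detection_confidence,
                face.joy_likelihood.name,
                face.sorrow_likelihood.name,
                face.surprise_likelihood.name,
                face.anger_likelihood.name,
                face.blurred_likelihood.name,
            ]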

Go ahead and run the function to arrive at a feature-category-level dataset for the batch of URLs you collected in Part 1a.

urls = links_df['urls'].unique().tolist()
runvisionai(urls)
print(label_df)
print(objects_df)  # check that the feature category datasets are prepared

If everything panned out well by the end of Part 1b, you will have filled data frames with different units (URL, face, location, label, logo, etc.) and the corresponding metrics for each feature category.

Fig 8: Output data frames for the feature categories; Analysis by Srinivas Vadrevu

Part 2: Visualization

Photo by Jakayla Toney on Unsplash

In the next article, I will plot and analyze the images in light of the features extracted by Vision AI, and I will update this article with a link to Part 2 soon. As a preview, I have charted the labels annotated for the images published in the US Election coverage by CNN and NYT in the last three-day window (29th Oct - 31st Oct). The y-axis has the labels arranged in descending order of frequency of appearance in the images; the x-axis represents the count/frequency. I have only considered labels identified in an image with more than 90% confidence.

Using the images below, you can compare and contrast the labels annotated in the images published on the CNN and NYTimes websites between 29th and 31st October 2020.
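For reference, here is a minimal sketch of how such a chart could be produced from the label data frame; the column names 'Source', 'label', and 'score' are assumptions about how label_df was structured.

# Sketch: horizontal bar chart of label frequencies for labels detected with >90% confidence.
# Column names in label_df ('Source', 'label', 'score') are assumptions.
import matplotlib.pyplot as plt

high_conf = label_df[label_df['score'] > 0.90]
counts = (high_conf[high_conf['Source'] == 'nytimes.com']['label']
          .value_counts()
          .head(20)
          .sort_values())

counts.plot(kind='barh')
plt.xlabel('Count of appearances')
plt.title('Top labels in NYT images (confidence > 90%)')
plt.tight_layout()
plt.show()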

Fig 9: Top labels (identified with more than 90% confidence in the NYT images), arranged by count of appearances; Analysis by Srinivas Vadrevu
Fig 10: Top labels (identified with more than 90% confidence in the CNN images), arranged by count of appearances; Analysis by Srinivas Vadrevu

Conclusion for Part 1 of this article: The main objective of this article is to outline the code and steps involved in generating an extensive feature dataset from a batch of images. I have used the US elections 2020 as a marketing context to extract features from images on the internet. In my view, these image feature components could potentially serve as explanatory variables for visual content campaign performance. In other words, it would be quite interesting to link these image datasets to campaign marketing analytics data (impressions, likes, clicks, CTR, etc.). Upon doing so, I surmise we would find that certain combinations of poses, labels, facial emotions, objects, color combinations, and logos perform better, say in click-through rates, vis-à-vis other combinations. This explicit knowledge can then be fed forward to the creative teams involved in visual content generation.
