Why You Need Alternative Data and How to Use It

James Karam
Towards Data Science
4 min read · Apr 1, 2020

The days of relying on traditional datasets are gone

Analyzing light from satellite imagery
Photo by NASA on Unsplash

You are the owner of Alternative Tech, a chain of technology shops in the Kingdom of Saudi Arabia. So far, Alternative Tech has had success in Riyadh, mainly thanks to an understanding of the local market.

Your business partners used their knowledge of the local dynamics in order to customize the shops: staffing people who speak the right language, finding strategic locations, and stocking stores with items that match the interests of people in those areas.

Your next step? Expand to the city of Jeddah. The challenge? You have no understanding of the local dynamics, no partners to fill that gap, and no data to inform your decisions.

A lack of reliable data is a major pain point for businesses everywhere, and particularly so in the Middle East. However, I want you to view this as an opportunity.

Let us take a step back and look at the United States. The US Census Bureau conducts nationwide censuses that result in, among other things, granular demographic data, such as income, nationality, and age group, at the zip code level. Governments, public institutions, and private businesses alike use this data to inform a multitude of decisions, including:

  • Deciding on the location and capacity of new housing and public facilities
  • Optimizing locations of telecommunication towers
  • Opening and customizing retail shops

US census data is far from optimal, facing issues such as outdated records and bias. Nevertheless, it serves as a good starting point.

This is not the case in the Middle East. Not only is traditional data lacking on many dimensions, it is also too coarse, usually available only at a district or regional level.

This is where alternative data comes in.

So what is alternative data? The term comes from finance, where investment firms started using non-traditional datasets to extract unique and timely insights into a company’s performance from sources other than its regular filings. For example, a hedge fund might use satellite imagery of a retail store’s parking lot to forecast quarterly revenues and trade accordingly in the equity markets.

In fact, an estimated $1.1 billion was poured into the alternative data industry in 2019, with expected growth of 55% for 2020¹.

In a nutshell, alternative data provides granular, up-to-date information that can fill the gaps where traditional data is lacking (such as the absence of a proper census in the Middle East), or overlay existing datasets with rich information seen through a different, and potentially proprietary, lens.

Now that we have established what alternative data is, let us go through an example to help explain how you would use alternative data to guide your decision making on your new technology shop in Jeddah.

  1. Identify the questions you need to answer. For the sake of brevity, let us say you have two pending questions: where should the shop be located, and what language should the salespeople speak?
  2. Translate the questions into dimensions that can be analyzed. Say the location of the shop mainly depends on the income level of the neighborhood, since Alternative Tech focuses on high-income level areas. Also, assume that language is driven mainly by the density of nationalities in the vicinity.
  3. Research data sources that would help answer those questions. First, start by researching official data, such as data published by national or local authorities. While this data might not provide the granular, up-to-date view that you want, it provides a baseline and helps you validate and triangulate any granular data that you come across. Then, explore alternative data. For income level, you might use data on residential property prices from online platforms as a proxy.
  4. Use data analytics to extract, clean, and aggregate alternative data. For property prices, you might parse (i.e., extract information from) a platform such as Airbnb or Property Finder to identify the average price per square meter in different areas and rank those neighborhoods into three buckets: low, medium, and high income levels.
  5. Validate your findings. This is the most important step. There is no single source of truth; instead, you triangulate different data sources to arrive at granular, up-to-date information that makes business sense. To validate findings, you can use a combination of alternative data sources for the same dimension, or use previously collected district-level data to vet your granular findings and fill any gaps.
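
To make steps 4 and 5 more concrete, here is a minimal Python sketch. It assumes the listings have already been extracted from a platform, so no scraping is shown. Every name and number in it is hypothetical: the neighborhoods (N1 to N6), the districts (D1 to D3), the prices, and the official income figures are illustrative placeholders, not real data for Jeddah. It also makes one simplifying choice: neighborhoods are ranked by average price per square meter and split into three roughly equal groups, which is only one of several reasonable ways to define low, medium, and high.

```python
from statistics import mean

# Hypothetical, already-extracted listings (illustrative values, not real data):
# (neighborhood, district, asking price in SAR, floor area in sqm)
LISTINGS = [
    ("N1", "D1", 300_000, 150),
    ("N1", "D1", 420_000, 200),
    ("N2", "D1", 500_000, 200),
    ("N3", "D2", 900_000, 250),
    ("N4", "D2", 800_000, 200),
    ("N5", "D3", 1_500_000, 250),
    ("N6", "D3", 1_320_000, 200),
]

# Hypothetical official baseline: average monthly household income per district.
OFFICIAL_DISTRICT_INCOME = {"D1": 8_000, "D2": 12_000, "D3": 20_000}


def average_price_per_sqm(listings):
    """Step 4: clean the listings and average price per sqm by neighborhood."""
    per_neighborhood = {}
    for neighborhood, _district, price, area_sqm in listings:
        if price <= 0 or area_sqm <= 0:  # drop obviously invalid records
            continue
        per_neighborhood.setdefault(neighborhood, []).append(price / area_sqm)
    return {name: mean(values) for name, values in per_neighborhood.items()}


def bucket_by_rank(avg_prices):
    """Rank neighborhoods by price and split them into three roughly equal buckets."""
    ranked = sorted(avg_prices, key=avg_prices.get)
    labels = ("low", "medium", "high")
    return {name: labels[i * 3 // len(ranked)] for i, name in enumerate(ranked)}


def district_averages(listings, avg_prices):
    """Roll neighborhood averages up to district level so they can be compared."""
    neighborhood_to_district = {n: d for n, d, _price, _area in listings}
    per_district = {}
    for neighborhood, value in avg_prices.items():
        district = neighborhood_to_district[neighborhood]
        per_district.setdefault(district, []).append(value)
    return {district: mean(values) for district, values in per_district.items()}


def rankings_agree(granular, official):
    """Step 5: check that both sources order the districts the same way."""
    return sorted(granular, key=granular.get) == sorted(official, key=official.get)


avg_prices = average_price_per_sqm(LISTINGS)
buckets = bucket_by_rank(avg_prices)
district_avg = district_averages(LISTINGS, avg_prices)
consistent = rankings_agree(district_avg, OFFICIAL_DISTRICT_INCOME)

print(buckets)
print("Granular proxy consistent with official baseline:", consistent)
```

If the two rankings disagree, that is a signal to revisit the proxy or bring in a second alternative source for the same dimension before acting on the buckets.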

Whether you are a decision maker in a business-to-consumer corporation, a policy maker, a consultant, or a program director in a public institution, you have probably learned first-hand that your insights are only as good as the data that inform them. Consequently, having alternative data sources that complement your traditional datasets is extremely important for enriching the journey from insight to decision.

This is the first in a series of articles on data analytics and visualization. In the following articles, I will further define data sources, highlight different use cases, tackle the risks involved, and touch upon various aspects of popular tools used in data analytics and visualization.

[1] “Alternative Data. What Is It, Who Uses It and Why Is It Interesting?” Forbes, 12 Dec. 2019, https://www.forbes.com/sites/forbesinsights/2019/12/12/alternative-data-what-is-it-who-uses-it-and-why-is-it-interesting/#5024058c6123.


Management consultant focused on leveraging data analytics to solve business challenges. LinkedIn: https://www.linkedin.com/in/jameskkaram/