Exploring venues in Chandigarh, India using Foursquare and Zomato API

Data Science Capstone Project

Karan Bhanot
Towards Data Science


Sukhna Lake, Chandigarh

As part of IBM’s Applied Data Science Capstone course on Coursera.org, I worked on a capstone project in which I used the Foursquare and Zomato APIs to fetch location, rating and price information for various venues in Chandigarh, India. In this article, I’ll discuss how I combined the data from both APIs and drew meaningful information from it.

Note that the maps might not render directly in GitHub’s view of the notebook, so you can check them by cloning the repo and opening the maps folder.

Introduction

In this article, we’ll explore venues in Chandigarh, India based on their ratings and average prices. Whenever people visit a city, they start looking for venues to visit during their stay. They primarily choose places based on venue ratings and average prices, so that the places fit within their budget. Thus, our aim here is to identify places that someone can visit.

Here, we’ll identify places that suit different kinds of visitors, based on the data collected from the Foursquare and Zomato APIs and the insights drawn from analysing that data.

Data Discussion

The data was collected from two APIs: the Foursquare API and the Zomato API. The first step was to search for venues within a radius of 4 kilometers of Chandigarh’s center point. After extracting over 120 locations using the Foursquare API, their latitude and longitude values were used to fetch the venue details using the Zomato API.
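To make the 4 km search radius concrete, here is a minimal sketch of a local distance check using the haversine formula. The center coordinates and the sample venue are illustrative assumptions, not values taken from the project, and in practice the radius would normally be passed along with the venue search request rather than checked by hand.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def within_radius(center, venue, radius_km=4.0):
    """True when `venue` lies no more than `radius_km` from `center`."""
    return haversine_km(center[0], center[1], venue[0], venue[1]) <= radius_km


# Assumed, approximate coordinates used purely for illustration.
CHANDIGARH_CENTER = (30.7333, 76.7794)
sample_venue = (30.7250, 76.7600)

print(within_radius(CHANDIGARH_CENTER, sample_venue))
```

A check like this is also handy for sanity-testing the results that come back from an API before plotting them.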

Venues retrieved from Foursquare API
Venues retrieved from Zomato API

We see that some venues overlap while others are way off. Thus, after careful analysis, we decided to drop all corresponding venues from the two datasets whose latitude and longitude values differed by more than 0.0004 degrees. Once this was done, we observed that there were still some venues that did not align, which can be categorised as follows:

  1. Some venues have specific restaurants/cafes inside them, as returned by the Zomato API (Pizza Hut in Elante Mall).
  2. Two locations are so close together that they have practically the same latitude and longitude values (The Pizza Kitchen and Zara).
  3. Some venues have been replaced with new venues (Underdoggs has now been replaced by The Brew Estate).

While it’s okay to keep the venues that belong to categories 1 and 3, we drop the venues in category 2. This left us with a dataset of 49 venues.
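The coordinate filter described above can be sketched as follows. This is a minimal illustration, not the project code: the venue records, field names and the pairing of the two lists by position are all assumptions, and a pair is kept only when both the latitude and the longitude differences stay within 0.0004, which is one reading of the rule.

```python
TOLERANCE = 0.0004  # maximum allowed coordinate difference, as stated above


def coordinates_match(fsq, zom, tolerance=TOLERANCE):
    """True when both latitude and longitude differ by at most `tolerance`."""
    return (abs(fsq["lat"] - zom["lat"]) <= tolerance
            and abs(fsq["lng"] - zom["lng"]) <= tolerance)


# Hypothetical data: the i-th Zomato record was looked up from the i-th Foursquare venue.
foursquare = [
    {"name": "Venue A", "lat": 30.7050, "lng": 76.8010},
    {"name": "Venue B", "lat": 30.7400, "lng": 76.7800},
]
zomato = [
    {"name": "Venue A", "lat": 30.7052, "lng": 76.8011, "rating": 4.1, "price_range": 2},
    {"name": "Somewhere Else", "lat": 30.7460, "lng": 76.7700, "rating": 3.2, "price_range": 1},
]

# Keep only the pairs whose coordinates agree, merging location with rating and price.
combined = [
    {"name": f["name"], "lat": f["lat"], "lng": f["lng"],
     "rating": z["rating"], "price_range": z["price_range"]}
    for f, z in zip(foursquare, zomato)
    if coordinates_match(f, z)
]

print([venue["name"] for venue in combined])
```

Pairs that pass this filter can still disagree on the name, which is why the remaining mismatches had to be reviewed by hand.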

Methodology

As a first step, we retrieve the data from two APIs (Foursquare and Zomato). We extract venue information around the center of Chandigarh, up to a distance of 4 km. The latitude and longitude values are then used to fetch each venue’s rating and price from Zomato.

The data from the two sources is carefully combined based on the name, latitude and longitude values of each venue. The final dataset includes the rating and price values for each venue.

Next, we analyse the resulting data based on the rating and price of each venue. We identify the top venue categories. We identify places where many venues are located, so that a visitor can go to one place and choose among many options. We also explore areas that are highly rated and those that are poorly rated, and plot a map of high and low priced venues. Lastly, we cluster the venues based on the available information for each venue. This will allow us to clearly identify which venues can be recommended and with what characteristics.

Finally, we’ll discuss and conclude which venues to explore based on a visitor’s rating and cost requirements.

Analysis

During the analysis phase, I explored the venue categories, the rating distribution of the venues and the price range across the map of Chandigarh.

Categories

As we extracted categories from the Foursquare API, identifying which types of venues are most popular in the city would be really helpful. We plot a bar chart for the same.

It appears that the majority of venues in Chandigarh are either Cafes or Indian Restaurants. If a visitor is trying to explore either of them, they’re in luck.
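The counts behind such a bar chart can be computed with a simple tally. The sketch below uses a made-up list of category labels; the real labels would come from the Foursquare results.

```python
from collections import Counter

# Hypothetical category labels, one per venue.
categories = [
    "Cafe", "Indian Restaurant", "Cafe", "Hotel",
    "Indian Restaurant", "Cafe", "Bakery",
]

category_counts = Counter(categories)

# Print the most frequent categories first; these are the bar heights.
for category, count in category_counts.most_common(3):
    print(f"{category}: {count}")
```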

Rating

Next, we take a look at the ratings of the venues. As a visitor, you’d like to know which places have well rated venues. We can plot a bar chart of the ratings and the count of venues at each rating to see what the typical rating across all venues is.

We see that the ratings range from 1.0 to 5.0. The plot reveals that most venues have a rating close to 4. The visitor might also be interested in knowing where the high rated venues are actually located.

Venues with different ratings

The venues marked in orange or red have a rating below 3, while the venues marked in green or dark green have a rating of 3 or above. We can see that many high rated venues are located near Sector 35 and Sector 17. Elante Mall has venues with ratings across the complete range. Also, the belt of venues from Sector 11 to Sector 7 and Sector 26 is highly rated.
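The color coding on the map amounts to binning each rating. Here is a minimal sketch of such a mapping; only the split at 3 comes from the description above, while the inner bin edges at 2 and 4 and the color names are assumptions.

```python
def rating_color(rating):
    """Map a rating between 1.0 and 5.0 to a marker color (inner bin edges are assumed)."""
    if rating < 2:
        return "red"
    if rating < 3:
        return "orange"
    if rating < 4:
        return "green"
    return "darkgreen"


# Hypothetical ratings, one per venue.
for rating in (1.5, 2.8, 3.0, 4.3):
    print(rating, rating_color(rating))
```

The returned color can then be passed to whatever mapping library draws the markers.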

Price

Next, we explore the average price per person at each venue, using a scatter plot of each average price against the count of venues with that price.

From the plot above, we can see that a large number of venues have an average price between Rs 200 and Rs 400. We can also plot the venues on the map by price range to see how prices vary across areas.

Venues with different prices

From the plot, we observe that venues near Sector 35 and Sector 17 are primarily lower priced. The venues near Sector 7 and Sector 26 have steep prices. Elante Mall seems to have a mix of both high priced and low priced venues.

Clustering

We will now cluster all these venues based on their price range, location and more to identify similar venues and the relationships among them. We’ll cluster the venues into two separate groups.

From the map, we see the two clusters:

  1. The first cluster (green) is spread across the whole city and includes the majority of the venues. These venues have a mean price range of 1.71 and ratings spread around 3.57.
  2. The second cluster (red) is very sparsely spread and has very few venues. These venues have a mean price range of 3.21 and ratings spread around 4.03.
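The clustering step can be sketched as follows. The algorithm is not named above, so k-means is an assumption here, as are the two features (price range and rating), the sample values and the simple first-k initialisation. A library implementation with standardised features would be the more usual choice in practice.

```python
from statistics import mean


def kmeans(points, k=2, iterations=20):
    """Minimal Lloyd's k-means on a list of equal-length numeric tuples."""
    # Deterministic start (an assumption): the first k points act as centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(mean(dim) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical (price_range, rating) pairs, one per venue.
venues = [(1, 3.4), (4, 4.2), (2, 3.6), (2, 3.7), (3, 4.0), (1, 3.5)]

labels, centroids = kmeans(venues, k=2)
for cluster, centroid in enumerate(centroids):
    print(cluster, labels.count(cluster), centroid)
```

Each centroid holds the per-cluster mean price range and mean rating, which is the kind of summary quoted in the list above.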

Results and Discussion

After collecting data from the Foursquare and Zomato APIs, we got a list of 120 different venues. However, the venues returned by the two APIs did not always match. Hence, we had to inspect their latitude and longitude values as well as their names to combine them and remove the mismatched entries. This resulted in a total venue count of 49.

We identified that, of the total set of venues, the majority were Cafes and Indian Restaurants. A visitor who loves Cafes/Indian Restaurants would surely benefit from coming to Chandigarh.

While the ratings range from 1 to 5, the majority of venues have ratings close to 4. This suggests that most restaurants serve good quality food that the people of the city like. When we plot these venues on the map, we discover that there are clusters of venues around Sector 17, Sector 35 and Elante Mall. These clusters also have high ratings (more than 3).

When we look at the price of each venue, we find that many venues have prices in the range of Rs 200 to Rs 400 for one person. However, the variation in prices is very large, as the complete range starts at Rs 100 and goes up to Rs 1200. On plotting the venues on the map by price range, we discovered that venues near Sector 17 and Sector 35 are priced relatively lower than venues in Sector 7 and Sector 26. A mix of low and high prices exists in Elante Mall.

Finally, through clustering we identified that there are many venues that are relatively low priced and have an average rating of 3.57. On the other hand, there are a few venues that are high priced and have an average rating of 4.03.

  1. If you’re looking for cheap places with relatively high ratings, you should check Sector 35.
  2. If you’re looking for the best places, with the highest ratings but possibly also a high price tag, you should visit Sector 7 and Sector 26.
  3. If you’re looking to explore the city and have no specific criteria for choosing the places you want to visit, you should try Elante Mall.

A company can use this information to build a website or mobile application that provides users with up-to-date information about venues in the city based on search criteria (name, rating and price).
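As a rough idea of what such a search could look like, here is a minimal sketch of a filter over the combined venue data. The function name, field names and sample records are all hypothetical.

```python
def search_venues(venues, name=None, min_rating=0.0, max_price=None):
    """Filter venue dicts by optional name substring, minimum rating and maximum price."""
    results = []
    for venue in venues:
        if name and name.lower() not in venue["name"].lower():
            continue
        if venue["rating"] < min_rating:
            continue
        if max_price is not None and venue["price"] > max_price:
            continue
        results.append(venue)
    # Best rated first; cheaper first among equally rated venues.
    return sorted(results, key=lambda v: (-v["rating"], v["price"]))


# Hypothetical records: price is the average cost for one person in Rs.
venues = [
    {"name": "Cafe One", "rating": 4.2, "price": 350},
    {"name": "Fine Dine", "rating": 4.5, "price": 1100},
    {"name": "Snack Corner", "rating": 3.1, "price": 150},
]

for venue in search_venues(venues, min_rating=4.0, max_price=400):
    print(venue["name"], venue["rating"], venue["price"])
```

The same function covers a name lookup, for example `search_venues(venues, name="cafe")`.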

Conclusion

The purpose of this project was to explore the places that a person visiting Chandigarh could visit. The venues were identified using the Foursquare and Zomato APIs and plotted on the map. The map reveals that there are three major areas a person can visit: Sector 35, Sector 7 & 26, and Elante Mall. Based on their venue rating and price preferences, visitors can choose among the three.


Data science and Machine learning enthusiast. Technical Writer. Passionate Computer Science Engineer.