Use Data Science to locate your next restaurant

This blog provides an approach to use existing franchise locations to determine the best location for your restaurant.

Opening a Chipotle franchise in your city | Photo by Justin Snyder Photo on Unsplash

NOTE: For detailed analysis and stats, do check out the GitHub code!

Traditionally, deciding where to open a restaurant relies on feasibility studies, cost estimates, and economic factors. Data science can add deeper insight to that decision. Living in Bloomington, IL, I set out to find the best spot for the next Chipotle outlet, using the Chipotle location data available on Kaggle, as seen below.

Fig 1. Chipotle franchises across the United States. Top locations (in Red) and other locations (in Blue) | Image by author

Geographical Analysis

The next requirement is neighborhood information, which the dataset does not provide directly. Fortunately, several vendors such as Google and Yelp offer this data through APIs. I recommend Foursquare, which offers about 50,000 free API credits daily and is easy to use. It provides:

  • Venues: all venues in the area
  • Users: profile details of the users
  • Tips: ratings, photos, and comments by the users

We can iterate over our locations and query the Venues endpoint to get the relevant information.

The results provide the neighboring venues and their categories, which can be collated into a DataFrame.
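
As a rough illustration, the query-and-collate step could be sketched as below. This assumes Foursquare's legacy v2 venues/explore endpoint and its response layout (response, groups, items, venue); the current Foursquare API may differ. The function names, the (outlet_id, lat, lng) tuples, the 500 m radius, and the column names are hypothetical choices, not taken from the original analysis.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the legacy Foursquare v2 "explore" endpoint.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      radius_m=500, limit=100, version="20200101"):
    """Build the request URL for venues around one outlet."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date expected by the v2 API
        "ll": f"{lat},{lng}",    # latitude,longitude of the outlet
        "radius": radius_m,
        "limit": limit,
    }
    return EXPLORE_URL + "?" + urllib.parse.urlencode(params)


def parse_venues(payload, outlet_id):
    """Flatten one JSON response into one row (dict) per nearby venue."""
    rows = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item.get("venue", {})
            location = venue.get("location", {})
            categories = venue.get("categories") or [{}]
            rows.append({
                "outlet": outlet_id,
                "venue": venue.get("name"),
                "venue_lat": location.get("lat"),
                "venue_lng": location.get("lng"),
                "category": categories[0].get("name"),
            })
    return rows


def fetch_nearby_venues(outlets, client_id, client_secret):
    """Query the API once per (outlet_id, lat, lng) tuple and collect rows."""
    rows = []
    for outlet_id, lat, lng in outlets:
        url = build_explore_url(lat, lng, client_id, client_secret)
        with urllib.request.urlopen(url, timeout=10) as response:
            payload = json.load(response)
        rows.extend(parse_venues(payload, outlet_id))
    return rows
```

The returned list of dictionaries can be passed to pandas.DataFrame(rows) to get one row per nearby venue.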

Data Pre-processing

Some categories may appear too rarely to be informative. Depending on the dataset size, set a cut-off frequency (for example, a minimum of 5 occurrences) and remove the venues whose categories fall below it. The resulting DataFrame is shown in Fig 2.
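
The cut-off could be applied along these lines. This is only a sketch: it assumes each venue is a dictionary with a "category" key (a hypothetical layout), and the threshold of 5 is the example value from the text rather than a recommended setting.

```python
from collections import Counter


def drop_rare_categories(rows, min_count=5):
    """Keep only venues whose category occurs at least min_count times."""
    counts = Counter(row["category"] for row in rows)
    return [row for row in rows if counts[row["category"]] >= min_count]
```

With a pandas DataFrame, the same idea can be expressed by counting the category column and filtering on the counts.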

Fig 2. Initial DataFrame containing location and the queried neighborhood data | Image by author

The objective is to determine which venue types best match our restaurant theme and use them to shortlist similar areas in our city. The final preprocessing step is to find the 10 most common venue categories for each location.
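
A minimal sketch of this ranking step, assuming each venue row carries hypothetical "outlet" and "category" keys, might look like this:

```python
from collections import Counter, defaultdict


def top_categories(rows, n=10):
    """Return the n most frequent venue categories for each outlet."""
    per_outlet = defaultdict(Counter)
    for row in rows:
        per_outlet[row["outlet"]][row["category"]] += 1
    return {
        outlet: [category for category, _ in counts.most_common(n)]
        for outlet, counts in per_outlet.items()
    }
```

Each outlet then maps to a ranked list of categories, which corresponds to one row of the common-venues table.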

Fig 3. Common Venues in the neighborhood of shortlisted outlets | Image by author

Clustering Venues

After shortlisting, one approach is simply to list the most common venues. Clustering, on the other hand, reveals the common themes for a region. Here, K-Means clustering groups locations with similar venue profiles, and the dominant venues in each cluster describe its theme.
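
To make the clustering step concrete, here is a small, dependency-free sketch of K-Means on venue-category profiles. It is not the code behind the results in this post; in practice a library implementation such as scikit-learn's KMeans would be the usual choice. The profile encoding (share of each category among a location's nearby venues), the function names, and the toy data are all assumptions for illustration.

```python
import random
from collections import Counter


def category_profile(categories, vocabulary):
    """Share of each category among one location's nearby venues."""
    counts = Counter(categories)
    total = len(categories) or 1  # avoid division by zero for empty lists
    return [counts[name] / total for name in vocabulary]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]  # random start
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data (made up): nearby venue categories for four outlets.
vocabulary = ["Coffee Shop", "Mexican Restaurant", "Park"]
nearby = {
    "outlet_a": ["Coffee Shop", "Coffee Shop", "Mexican Restaurant"],
    "outlet_b": ["Coffee Shop", "Coffee Shop", "Coffee Shop", "Mexican Restaurant"],
    "outlet_c": ["Park", "Park", "Coffee Shop"],
    "outlet_d": ["Park", "Park", "Park", "Mexican Restaurant"],
}
profiles = [category_profile(cats, vocabulary) for cats in nearby.values()]
labels, centroids = k_means(profiles, k=2)
```

The largest entries of each centroid indicate which venue categories dominate that cluster, which is one way to read off a theme.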

For the present scenario, the comparison of the best locations vs other franchise locations was:

Table 1. Comparison of venue themes at best performing and normal Chipotle locations

Both had similar top venues; the only distinguishing factor was the higher population density near the top locations.

Similar Clusters Locally

The final task is to find similar locations within the selected city. Querying the Foursquare API for the venue types above lets us shortlist the best locations. Based on the clustering, feasible locations should lie close to similar restaurants, especially Mexican venues.

The final decision must also avoid placing a new franchise close to existing Chipotle locations.
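
One way to express this proximity rule is a simple distance filter, sketched below. It uses the haversine great-circle distance; the 2 km threshold and the function names are hypothetical, and a sensible minimum distance would depend on the city.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two (lat, lng) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def far_from_existing(candidates, existing, min_km=2.0):
    """Keep candidate (lat, lng) sites at least min_km from every existing outlet."""
    return [
        site for site in candidates
        if all(haversine_km(site[0], site[1], outlet[0], outlet[1]) >= min_km
               for outlet in existing)
    ]
```

Candidates that survive this filter can then be compared on the remaining criteria.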

The plot shows 3 locations that should be avoided, leaving 2 areas with similar venues: the West and the South (Downtown) parts of the city.

Fig 4. Location of Mexican restaurants in Bloomington (in green) with operating Chipotle locations (in red) | Image by author

Business Decision

Based on the query, we compare the 2 regions to finalize where the restaurant can open. Final points to consider include the safety and price of a neighborhood.

Fig 5. The location in the West is safer and equally expensive | Image by author

Conclusions

The area in the West of the city seems the best option considering the following:

  • The South (Downtown) has a large cluster of restaurants in close proximity, so competition there will be heavy.
  • In the comparatively safer West region, on the other hand, the restaurants are farther apart.
  • The West is also a good match because its other establishments are similar to Chipotle in price level and target population.

Fig 6. Location of all the restaurants in Bloomington. The highlighted region is the suggested spot for opening the next franchise | Image by author

This was a pre-feasibility check of the geographic location; the final site can be decided based on the area’s actual land availability, pricing, and other factors.


Thank you so much for reading! I would love to hear your feedback and will reply promptly to any queries.

