How to estimate data collection costs for your Data Science project
Tips for navigating between high barriers to entry and outdated APIs in online markets: a case study from the online rental market
(Note: all opinions are my own.)
Introduction
Data collection is the first and most fundamental step in any Data Science or Analytics project: every later activity, from data analysis to model deployment, relies on it.
With APIs and Cloud Computing now pervasive, I am increasingly interested in maximizing the efficiency and level of automation of data collection for both work and personal projects.
In the latter category, I have been interested in collecting data from online home-rental platforms in the UK market (Zoopla, Rightmove, OnTheMarket, and similar). The aim is to extract image and text data for use in machine learning models, with use cases such as predicting a property’s price, extracting key features from image data to infer a listing’s true value, and processing customer reviews with NLP techniques.
In the following lines, I discuss how one might go about:
- The identification…