How to estimate data collection costs for your Data Science project

Tips for navigating between high barriers to entry and outdated APIs in online markets: a case study from the online rental market

Edoardo Romani
Towards Data Science
9 min read · Aug 6, 2020

Photo by Ali Yılmaz on Unsplash

(Note: all opinions are my own)

Introduction

Data collection is the first and most fundamental step in any Data Science or Analytics project: every subsequent activity, from data analysis to model deployment, relies on it.

With APIs and Cloud Computing now pervasive, I am increasingly interested in maximizing the efficiency and level of automation of data collection for both work and personal projects.

In the latter category, I have been interested in collecting data from online home-rental platforms in the UK market (Zoopla, RightMove, OnTheMarket, and similar). The aim is to extract image and text data for use in machine learning models, with use cases such as predicting a property’s price, extracting key features from image data to infer a listing’s true value, and processing customer reviews with NLP techniques.

In the following lines, I aim to discuss how to potentially go about:

  1. The identification
