Analyzing Houses for rent in Mumbai using QGIS and PostGIS Functions

A spatial data analysis using QGIS

Aveek Das
Towards Data Science

--

Image by Author

Introduction

Mumbai is one of the most populous metropolitan city in India with an ever-increasing population of over 20M. It is also considered as the economic capital of India due to the presence of a large number of industries and organization head quarters within the city. As a result, people from all over the country move to Mumbai with various prospects of studies and jobs and that makes the real estate sector boom with sky rocketing prices of apartments to rent or lease.

In this project, we will try to analyze the apartments in Mumbai for rent or sale in 2020 and draw conclusions based on some predefined conditions using PostGIS and QGIS.

Motivation

A few months back, my cousin was trying to find a suitable apartment for sale within the city. She had few conditions to narrow down her search from the entire city of Mumbai to some specified locations. The conditions were as follows.

  1. She wants to understand what are the rates of the properties in order to decide on her budget.
  2. The apartment should be within 5kms radius of Mumbai International Airport (as she works at the airport).
  3. The apartment should be near the beach and sea facing.
  4. Finally, she also wanted to find the fastest route to the Mumbai Naval Dockyard as my BIL works there.

This intrigued me to work on this project and find them a good apartment which meets most of her criteria.

Implementation

To start with the analysis, I have imported the Mumbai House Price dataset from Kaggle. This dataset contains details for various flats located in Mumbai city. The description of the columns available in this dataset are as follows.

  • area : Floor area of the property
  • bathroom_num : Number of bathrooms available
  • bedroom_num : Number of bedrooms available
  • city : City in which property is located
  • desc : Text description of the property
  • dev_name : Name of property developer
  • floor_count : Total number of floors in building
  • floor_num : Floor on which property is located
  • furnishing : Furnishing status
  • id : Unique ID
  • id_string : Unique ID string used to scrape a particular HTML page element.
  • latitude : Latitude of location
  • locality : Locality in which property is located
  • longitude : Longitude of location.
  • post_date : Date on which property was listed on website.
  • poster_name : Name of poster
  • price : Price of the property
  • project : Name of the residential complex
  • title : Title of the property ad on the website
  • trans : Type of property transaction
  • type : Type of residential complex
  • URL : URL of the individual property
  • user_type : Type of user who posted the ad

I have downloaded the data as a CSV file and then wrote a small program in python to clean the data and load it into a table in PostgreSQL. You can find the notebook here.

I found few of the columns had NULL values in those and so thought of removing those columns in order to keep the analysis simple. After the data cleaning process, now the dataframe needed to be converted to a GeoPandas Dataframe in order to represent the spatial data within PostGIS. The CRS used here is EPSG:4326. The data is then loaded in PostgreSQL database with the table name mumbai_house_price_raw.

Analysis

We will perform the following analysis and then draw conclusions as we go along.

  1. Visualize the entire dataset on the map
  2. Analyze the apartments by price range
  3. Finding places within 5km radius of the Airport
  4. Selecting sea facing apartments
  5. Distance from Naval Dockyard

Visualize the entire dataset on the map

Now that we have the entire dataset in our PostgreSQL table, we can easily import it to QGIS and visualize the various houses available in Mumbai. First, let us import the OpenStreetMap into QGIS as the background layer.

Go to Web → QuickMapServices → OSM → OSM Standard.

Layer → Add Layer → Add Postgis Layers. Select the name of the table that is available in PostGIS and click Add.

All the data points from the PostgreSQL table will now be loaded into the QGIS layer and the points will be plotted on the map. For more understanding, I have also provided exact location of the Mumbai Airport and Naval Dockyard.

Analyze the apartments by price range

As you can see in the above graph, there are over 30000 apartments that have been displayed on the map. However, it would be interesting to see how the price range of these apartments vary with respect to the locality. Basically, we are trying to answer if the locality has an impact on the house prices.

For this, we have divided the house prices into 3 bins as follows.

  • Below 20K
  • 20K to 40K
  • Above 40K

I am going to add a column in the PostgreSQL table and update with relevant information about the bins.

Now, we need to refresh the layer that we have already fetched in QGIS. This will bring the new column ‘house_price_category’ that we have just created. Once the new column is in the layer, the next step is to display the points on the map based on this category.

Right-click Layer → Properties → Symbology → Categorize → Value = house_price_category → Color Ramp = Create Manual Color Ramp for 3 categories → Classify.

This will categorize all the points on the map into three bins and it will look like below.

From the above analysis, now we have an idea about what our budget should be while searching for a home in the city of Mumbai. While few of the houses are below 40K (green and blue), most of the houses are priced above 40K (red). To verify the above observation, I did a quick check using SQL to find the number of apartments in each of the categories.

As it can be verified, there are almost around 20K+ apartments in the range above 50K. So, if we keep the budget high, the chances of leasing a better apartment is more.

Finding places within 5km radius of the Airport

Mumbai Airport Location: Lon: 72.874374, Lat: 19.096713

  1. Create a query in PostgreSQL that will filter the names of the buildings within 5km radius of the airport → Create and store this in a view.
  2. Fetch the view created in the previous step into QGIS and plot the apartments.

The following is the plot of all the apartments within a range of 5kms from the airport.

Selecting sea facing apartments

Here, we need to find the apartments that are facing the sea. This is a tricky part as we need some kind of benchmark to decide whether the apartment is near or far away from the shore and also if the apartment is West Facing. For those apartments that are near to the beach but faces in other directions should be excluded from the analysis.

We can fetch the coastline from OSM. For that, we need to install the QuickOSM plugin and then select Vector → QuickOSM → QuickOSM. In the Quick Query tab, we provide the key as “natural” and value as “coastline” and select the Canvas Extent and click on Run Query.

This is imported as a part of Mumbai’s coastline in QGIS. We can export this coastline and save it as a SQL dump and load into the PostgreSQL table. You can see part of the coastline highlighted in black.

Now that the coastline is ready, we can create a buffer of 1000m on one side along the coast and it will show the apartments that lie within the buffer zone.

Go to DB Manager and write a simple SQL query as follows to fetch the buffer zone based on the coastline geometry data. Import this as a new layer in QGIS.

For the apartments to be sea facing, they needs to be faced westwards. The “desc” column contains that information and we will create a view to filter the records that match the word ‘west facing’ or ‘West Facing’ using REGEX. Import this new view into QGIS and plot the output.

The following is the final output.

Here, as you can see, we have a few list of apartments that are west facing and a buffer of 1000m running along the coastline. The apartments which lie on the buffer zone are our points of interest as these are both within 1000m of the beach and also west facing. So these apartments can be further investigated by physically visiting.

Final Analysis — Distance from Naval Dockyard

After visiting the places physically, they have decided to finalize one of the 2BHK Apartment in Juhu Tara Road, the details of which can be found below.

Now, we will try to find the fastest and the shortest route by car to the Naval Dockyard.

Naval Dockyard Location: Lon: 72.832096, Lat: 18.926257

For this, we are going to use the Open Route Service plugin known as ORS Tools.

Once the plugin is installed, the next step is to signup at openrouteservice.org and get an API key.

As you can see in the figure above, we have selected the starting point as the home and the end point as Naval Dockyard Junction. The above shows the fastest route to be taken while driving a car and the distance is around 19.5 kms.

Alternatively, we can also do a shortest route analysis to see if the fastest and the shortest routes are the same. Repeat the process above only by changing the parameter.

As you can see in the figure above, the shortest route is around 19.2 kms and is different from the fastest route at some points.

Conclusion

So far, this project, we have extensively leveraged QGIS and PostGIS functions to analyze houses in the city of Mumbai and came to a conclusion about purchasing or renting a home that meets our requirements. We have also calculated the fastest and shortest routes to Naval Dockyard from the apartment and understood that there are small variances between both the routes.

QGIS All Layers

Above is a representation of all the layers that we have analysed in this project and the legends are mentioned below.

  • Small Dots (Red/Green/Blue) — All the apartments available for sale.
  • Large Green Circles — Apartments within the 5KM radius of the airport.
  • Large Green Circles with Black Dots — Apartments within the 5KM radius of the airport and are facing west.
  • Black Line — The coastline of the area.
  • Orange Patch — Buffer zone of the coastline depicting 1000m from the coastline.
  • Blue line — Fastest route from the apartment to Naval Dockyards.
  • Green line — Shortest route from the apartment to Naval Dockyards.

The project along with the source code can be found at https://github.com/aveek22/cs621-spatial-db.

--

--

Data Engineer, Cloud Data Architect, Thinker, Amateur Photographer. Enjoys short walks for hot chocolates. Blogs @ https://datacloudmag.com