Exploring the Melbourne Real Estate Market using Tableau

Introduction to data visualisation in Tableau

Rishabh Arora
Towards Data Science


Introduction

Melbourne, one of the most liveable cities in the world, attracts people from across the globe, and many of them dream of making this beautiful place their home. My journey in data science began when I moved to Melbourne, and I have always been fascinated by the real estate industry, so I decided to analyse the city’s housing market. In this article, I identify some of the drivers of house prices and try to help potential buyers make data-driven decisions. Since the analysis was completed in Tableau, I will also share a few dashboarding tips.

“Fun fact: I used this analysis myself when renting a place, so it might be useful if you are new to Melbourne or planning to buy a property here.”

Photo by Denise Jans on Unsplash

About Tableau

Tableau is a powerful and rapidly growing data visualisation tool used by many data-savvy organisations. Its self-service analytics and numerous features for turning data into business insights make it one of the most popular BI tools.

Data Description

The dataset used in this project covers houses sold in Melbourne from January 2016 to October 2018. It was posted on Kaggle by Tony Pino, who scraped it from the publicly available results published every week on Domain.com.au. The data fields include Date, Price, Suburb, Region name, Landsize, Building size and Distance from CBD, among others. (Kaggle)

Hypothesis and key questions

1. How does the building size to land size ratio change as we move closer to the CBD? Does this ratio affect the price of the houses?

2. What will be the average price of the houses in different metropolitan regions of Melbourne in the second quarter of 2018?

3. In which months are the most houses sold in Melbourne?

4. What are the top 10 suburbs of Melbourne by average price and by number of houses sold?

Data Processing

To analyse the building size to land size ratio and its relationship with distance from the city, a calculated column (Ratio) was added to the data. The exploration revolves around Price, Distance from CBD, Suburb, Region and this ratio. Null values in the Region column were imputed using the suburb information, and the remaining records with null values were removed using filters in Excel. Quality checks revealed properties with a building size to land size ratio greater than one; these records were dropped from the analysis.
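I did these steps in Excel, but for readers who prefer code, a rough pandas equivalent might look like the sketch below. The column names (Suburb, Regionname, Landsize, BuildingArea) and the toy rows are assumptions for illustration only, so adjust them to match your copy of the dataset.

```python
import pandas as pd

# Toy rows standing in for the Kaggle export. Column names and values
# are assumptions for illustration, not the real data.
df = pd.DataFrame({
    "Suburb": ["Suburb A", "Suburb A", "Suburb B", "Suburb C", "Suburb C"],
    "Regionname": ["Region X", None, "Region Y", "Region Z", "Region Z"],
    "Landsize": [200.0, 150.0, 600.0, 500.0, None],
    "BuildingArea": [120.0, 90.0, 150.0, 650.0, 200.0],
})

# Impute a missing region from another row of the same suburb.
suburb_to_region = (
    df.dropna(subset=["Regionname"])
      .drop_duplicates("Suburb")
      .set_index("Suburb")["Regionname"]
)
df["Regionname"] = df["Regionname"].fillna(df["Suburb"].map(suburb_to_region))

# Drop the records that still contain null values.
df = df.dropna()

# Add the building size to land size ratio (skip zero land sizes
# to avoid dividing by zero).
df = df[df["Landsize"] > 0].copy()
df["Ratio"] = df["BuildingArea"] / df["Landsize"]

# Quality check: a building larger than its land is treated as bad data.
clean = df[df["Ratio"] <= 1].reset_index(drop=True)
print(clean)
```

In this toy example, the second row gets its region from the first row, the last row is dropped because its land size is missing, and the fourth row is dropped because its ratio is above one.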

Exploratory Analysis

To start with the analysis, I first plotted a choropleth map showing the average price of houses in different regions of Melbourne.

Choropleth Map:

A choropleth map is a thematic map in which regions are shaded or patterned in proportion to a statistical variable that represents an aggregate summary of a geographic characteristic. (Wikipedia)

Figure 1. Variation of the average price of houses in different regions. Snapshot taken from the Tableau dashboard developed by the author.

The above visualisation shows that house prices in the CBD and the eastern coastal region are higher than in other regions.

I was interested in how the location of a house affects its building size to land size ratio, using the distance from the CBD as the measure of location. I plotted a dual-axis combination chart, consisting of a bar graph and a line graph, of “average ratio” against “distance from CBD”. The line graph shows the moving average of the ratio, which smooths the results. For readability, the distance is grouped into ranges of 5 km.
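The binning and smoothing behind this chart can be approximated outside Tableau as well. The following is a minimal sketch, assuming hypothetical Distance and Ratio columns, made-up values, and a two-bin moving average window (the window used in the dashboard is not stated here).

```python
import pandas as pd

# Made-up distances (km from the CBD) and ratios, for illustration only.
df = pd.DataFrame({
    "Distance": [1.0, 3.5, 6.0, 8.0, 11.0, 14.0, 17.5],
    "Ratio": [0.9, 0.7, 0.6, 0.4, 0.3, 0.3, 0.2],
})

bin_width = 5  # group distances into 5 km ranges

# Label each row with the lower edge of its bin: 0, 5, 10, ...
df["DistanceBin"] = (df["Distance"] // bin_width * bin_width).astype(int)

# Bars: average ratio per bin. Line: moving average across bins.
avg_ratio = df.groupby("DistanceBin")["Ratio"].mean()
moving_avg = avg_ratio.rolling(window=2, min_periods=1).mean()

summary = pd.DataFrame({"AvgRatio": avg_ratio, "MovingAvg": moving_avg})
print(summary)
```

A wider window gives a smoother line but hides more of the bin-to-bin variation, so it is worth trying a couple of values.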

Figure 2. Variation of the building to land size ratio with distance from the city. Snapshot taken from the Tableau dashboard developed by the author.

To get a clearer picture of how the ratio varies with location, I also created a choropleth map of Melbourne showing the average ratio in different suburbs, as shown in the figure below.

Figure 3. Variation of the average building size to land size ratio in different regions. Snapshot taken from the Tableau dashboard developed by the author.

The above visualisations show that the building size to land size ratio increases as we move closer to the CBD and the coastal region. In other words, houses far from the city tend to have more unoccupied land for a front yard and backyard than houses in the city and near the coast. The likely reasons are the lack of space and the high house prices in the CBD region.

Now let us look at the monthly trends of house sales in Melbourne. I plotted year-wise pie charts of the house sales to observe the monthly trends, as shown in the figure below. Since complete data for 2018 was not available, we will visualise the results for 2016 and 2017 only.

Figure 4. The monthly sales distribution for the years 2016 and 2017. Snapshot taken from the Tableau dashboard developed by the author.

It can be observed that sales peaked in November in 2016 and in July in 2017. In both years, most houses were sold in the period from May to November. This suggests that more houses are sold during and around the winter season.
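The numbers behind such pie charts are simply the share of each month in a year's sales. Here is a small sketch of that calculation in pandas, assuming a hypothetical Date column and a handful of made-up sale dates.

```python
import pandas as pd

# Made-up sale dates, for illustration only.
sales = pd.DataFrame({"Date": pd.to_datetime([
    "2016-05-07", "2016-11-12", "2016-11-19",
    "2017-03-04", "2017-07-01", "2017-07-08",
])})
sales["Year"] = sales["Date"].dt.year
sales["Month"] = sales["Date"].dt.month

# Number of houses sold per (year, month).
counts = sales.groupby(["Year", "Month"]).size()

# Each month's share of its year's sales: the slices of one pie chart.
share = counts / counts.groupby(level="Year").transform("sum")

# Month with the most sales in each year.
peak_month = {
    int(year): int(group.idxmax()[1])
    for year, group in counts.groupby(level="Year")
}
print(share)
print(peak_month)
```

Comparing shares rather than raw counts keeps the years comparable even when one year has more records than the other.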

To find the top 10 suburbs by highest average price and by highest number of houses sold, I plotted two bar graphs, as shown in the figure below.

Figure 5. Top 10 Suburbs by highest average price and highest number of houses sold. Snapshot taken from the Tableau dashboard developed by the author.

The above visualisation shows that Kooyong is the most expensive suburb by average price, and Reservoir is the most preferred suburb by number of houses sold.
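The same two rankings can be reproduced with a group-by. The sketch below assumes hypothetical Suburb and Price columns and uses placeholder suburbs and prices rather than the real data.

```python
import pandas as pd

# Placeholder suburbs and prices, for illustration only.
df = pd.DataFrame({
    "Suburb": ["Suburb A", "Suburb A", "Suburb B",
               "Suburb C", "Suburb C", "Suburb C"],
    "Price": [1_000_000, 1_200_000, 2_500_000,
              700_000, 800_000, 900_000],
})

# Average price and number of sales per suburb.
by_suburb = df.groupby("Suburb")["Price"].agg(
    avg_price="mean", houses_sold="count"
)

# Two separate top-10 rankings, one per bar graph.
top_by_price = by_suburb.nlargest(10, "avg_price")
top_by_sales = by_suburb.nlargest(10, "houses_sold")
print(top_by_price)
print(top_by_sales)
```

One thing to watch for is that a suburb with only one or two sales can top the price ranking, so it can help to look at both columns side by side.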

Predicting the future average prices using Tableau’s Forecasting Model:

I plotted a bar graph of the average price for each month, which can be filtered by region, and used Tableau’s forecasting model to predict house prices for the second quarter of 2018. The model uses the quarterly and monthly trends in past prices to determine the predicted price.
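Tableau's forecasting is built on exponential smoothing, and it chooses the specific model automatically. Which model it picked for these charts is not recorded here, so the following is only a minimal sketch of the simplest member of that family (no trend or seasonal component), with made-up prices and an arbitrary smoothing factor. It is not Tableau's implementation.

```python
def simple_exponential_smoothing(values, alpha):
    """Return the smoothed level after seeing every value.

    alpha close to 1 follows recent values; close to 0 changes slowly.
    """
    level = values[0]
    for value in values[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def forecast(values, alpha, horizon):
    """Forecast `horizon` future periods.

    Without a trend or seasonal component, every future period
    gets the same value: the last smoothed level.
    """
    return [simple_exponential_smoothing(values, alpha)] * horizon


# Made-up monthly average prices in $M, for illustration only.
history = [1.0, 2.0, 3.0]
predicted = forecast(history, alpha=0.5, horizon=3)
print(predicted)
```

A model of this kind always produces a flat forecast, which is worth keeping in mind when reading the charts for regions with sparse history.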

Figure 6. Prediction for the Eastern Metropolitan region. Snapshot taken from the Tableau dashboard developed by the author.

As per the above visualisation, the predicted average prices in the Eastern region for April, May and June 2018 are $1.22M, $1.23M and $1.24M respectively. The prediction follows the trend of prices dropping as we move from quarter 1 to quarter 2 in 2018. Furthermore, the model averages out the April-to-May and May-to-June changes seen in 2016 and 2017, which results in an increase in the prices for May and June.

Figure 7. Prediction for the Western and Northern Metropolitan region. Snapshot taken from the Tableau dashboard developed by the author.

In the Western and Northern Metropolitan regions, the model again follows the trend of prices dropping as we move from quarter 1 to quarter 2 in 2017. The April-to-May and May-to-June changes are averaged out, and a similar trend across all three months of quarter 2 is predicted for both regions.

Figure 8. Prediction for the Southern Metropolitan region. Snapshot taken from the Tableau dashboard developed by the author.

For the Southern Metropolitan region, the data for quarter 1 is missing for both 2016 and 2017. Hence the model cannot follow the quarter-to-quarter trend and predicts the same average price for April 2018 as for March 2018. Furthermore, based on the average April-to-May and May-to-June changes in the previous year, it predicts an increase in May and June 2018.

Figure 9. Prediction for the South-Eastern Metropolitan region. Snapshot taken from the Tableau dashboard developed by the author.

In the case of the South-Eastern Metropolitan region, the data for quarter 1 is missing for both 2016 and 2017. Moreover, the data for quarter 2 is also missing for 2016. Hence the model is unable to provide an accurate prediction and shows a flat value of $0.92M for all three months of quarter 2 of 2018.

Conclusion

This data exploration and visualisation helped us to gather a few useful insights about the Melbourne real estate market for aspiring buyers.

It was observed that the building size to land size ratio increases significantly as we move closer to the city area. The high prices and limited space in the city encourage people to use most of the land for the house itself.

Apart from the Southern Metropolitan region, the forecasting model has shown a decrease in house prices as we move from quarter 1 to quarter 2 of 2018. Moreover, the winter season was observed to be the most popular season for buyers to purchase a home.

References

  1. Pino, T. (2018). Melbourne Housing Market. Retrieved from https://www.kaggle.com/anthonypino/data
  2. Choropleth Map. Retrieved from https://en.wikipedia.org/wiki/Choropleth_map

Wrap

Thanks for reading 👍. Hope you found this article insightful. If yes, please share it on your favourite social media platform. I hope to come up with some more advanced data visualisations.

I am currently pursuing my Master of Data Science at Monash University, Melbourne. To provide any feedback or suggestions, please email me at rishabharora268@gmail.com. You can also connect with me on LinkedIn or Facebook.

Keep Tableauing!
