
Exploring Data Gov SG API

Inspecting 8 real-time Data Gov SG APIs

Starter Notebook of Real-Time APIs provided by Data Gov SG

Photo by Safar Safarov on Unsplash

Notebook: https://nbviewer.jupyter.org/github/tonyngmk/DataGovSG/blob/main/DataGovSG.ipynb

1. Intro

Data.gov.sg makes available public datasets from over 70 public agencies in Singapore. To date, more than 100 apps have been created using the government’s open data.

Under its Developers section, real-time datasets such as weather and traffic conditions are made available. This article serves as a starter for exploring how to use these real-time APIs and for basic analysis of the underlying data.


2. Requests

The APIs can be called with a simple HTTP GET request. This can be achieved using the requests package, a third-party Python library (installable with pip install requests).

import requests # HTTP requests (GET / POST)

Before we go on to the application, it may be helpful to know that the parameters are passed in the same way as the query string used when we perform searches on the internet. For instance, take the made-up URL https://example.com/path/to/page?name=ferret&color=purple. The part after the question mark, ?name=ferret&color=purple, carries the two query parameters ‘name=ferret’ and ‘color=purple’, joined by an ampersand ‘&’. We can hence use this to craft our API calls.

However, this would take a fair bit of string manipulation, as we would have to craft each URL by hand. Thankfully, there is a more intuitive approach. We can store the required parameters in a dictionary and pass it into the GET request function.
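To make the link between a dictionary and a query string concrete, here is a minimal sketch using only the standard library's urllib.parse.urlencode. It reuses the same made-up example.com address as above. The requests package performs a comparable encoding when a dictionary is passed as params.

```python
from urllib.parse import urlencode

# Parameters stored as a plain dictionary (same made-up example as above)
params = {"name": "ferret", "color": "purple"}

# urlencode joins each key=value pair with an ampersand
query = urlencode(params)
url = "https://example.com/path/to/page?" + query
print(url)
```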

For instance, a common parameter accepted by these APIs is date, a string in YYYY-MM-DD format. We can easily obtain this format with the strftime() method and the appropriate format string.

import datetime
today = datetime.datetime.today()
params = {"date": today.strftime("%Y-%m-%d")} # YYYY-MM-DD

After that, we can pass the dictionary as params in a GET request function. The response object exposes various attributes, such as the HTTP status code, and .json() parses the response body into a Python dictionary.

requests.get('https://api.data.gov.sg/v1/environment/psi', params=params).json()

3. APIs

After discussing the technicalities to obtain the data, let us try to inspect and make sense of the data made available. Of the 8 real-time APIs available, there are 5 weather and 3 miscellaneous APIs. First, we will walk through the miscellaneous APIs.

3.1 IPOS Applications API

The Intellectual Property Office of Singapore (IPOS) API allows a user to query Intellectual Property (IP) applications in Singapore. IP is fundamental for most businesses to protect creations of the human mind, such as inventions, trademarks, designs, or brands.

In my opinion, this API probably ranks highest in coolness but lowest in usability. Being able to retrieve hundreds of photos of patents is interesting, but it would likely not solve any real-world problems (in my opinion, for either data retrieval or ML).

Nonetheless, let us try it. At first, the API returns empty data for today’s date (11 Jan 2021) despite an HTTP 200 success response. This is likely because no IP applications were lodged on that date, so we need a way to search for non-empty data. A naive approach is to step backwards through the calendar one day at a time, for up to 365 days, and break once items are found. The loop breaks at 9 Sep 2020, and we can then inspect the data for that date.

today = datetime.datetime.today()
dateRange = [today - datetime.timedelta(days=i) for i in range(365)]
print("Searching:")
for date in dateRange:
    params = {"lodgement_date": date.strftime("%Y-%m-%d")} # YYYY-MM-DD
    ipos = requests.get('https://api.data.gov.sg/v1/technology/ipos/designs', params=params)
    if ipos.json()["items"]: # Not empty
        print("Found first item at: ", date)
        break
    print("No items found for: ", date) # Only reached when the date is empty
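The same backwards search can be sketched as a small reusable function. This is only an illustration under stated assumptions: find_latest_non_empty and fake_fetch are hypothetical names, and fake_fetch is a stand-in with made-up items so the sketch runs offline. In the notebook, the fetch function would wrap the requests.get(...).json()["items"] call shown above.

```python
import datetime

def find_latest_non_empty(fetch_items, start, max_days=365):
    """Walk backwards one day at a time from `start` and return the first
    date (and its items) for which `fetch_items` returns a non-empty list."""
    for offset in range(max_days):
        date = start - datetime.timedelta(days=offset)
        items = fetch_items(date.strftime("%Y-%m-%d"))
        if items:
            return date, items
    return None, []  # nothing found within the search window

# Hypothetical stand-in for the real API call (made-up items)
def fake_fetch(date_string):
    return ["design-1", "design-2"] if date_string == "2020-09-09" else []

found_date, found_items = find_latest_non_empty(fake_fetch, datetime.date(2021, 1, 11))
print(found_date, len(found_items))
```

Passing the fetch function in as an argument keeps the search logic separate from the HTTP call, which makes it easy to try out without hitting the API 365 times.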

There were 3 IP applications in total, likely lodged by the same creator or business, and we can inspect them further by drilling into the documents field.

Image by the Author | IPs Applied in 09 Sep 2020

Within each document instance, there is a url field. We can fetch each image’s raw bytes from its URL using requests.get(url).content, and then display it in the notebook using IPython.display.Image.

# Obtain 5 images from patent
from IPython.display import display, Image
for i in range(1, 6):
    display(Image(requests.get(ipos.json()['items'][0]['documents'][i]['url']).content))

Due to copyright issues, I shall not paste the image here, but you can view it in the nbviewer and this can be publicly retrieved via the API as well.

3.2 Carpark Availability API

The carpark availability API shows the available lots for approximately 2,072 carparks, refreshed every minute. Unfortunately, no geographical (geo) coordinates are provided as metadata, so the carparks cannot be plotted on a map directly. That said, 2,000+ markers would likely be too dense to interpret on a map anyway.

Image by the Author | Data retrieved by Carpark Availability API

The useful fields are carpark_number, lots_available, and total_lots. Unfortunately, I do not have any idea how to create a useful plot for this API (maybe leave a comment if you know of any viable ways 😀).
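One possible direction, offered only as a sketch, is to turn the three fields into an occupancy rate per carpark, which could then feed a histogram or a ranking. The records below are hypothetical and assume the response has already been flattened into one dictionary per carpark. The values are made up, and they are stored as strings here on the assumption that the API may return them that way.

```python
# Hypothetical flattened records; values are made up for illustration
carparks = [
    {"carpark_number": "A1", "total_lots": "100", "lots_available": "25"},
    {"carpark_number": "B2", "total_lots": "200", "lots_available": "180"},
    {"carpark_number": "C3", "total_lots": "0", "lots_available": "0"},
]

def occupancy_rate(record):
    """Fraction of lots in use, or None when total_lots is zero."""
    total = int(record["total_lots"])
    available = int(record["lots_available"])
    if total == 0:
        return None
    return (total - available) / total

rates = {c["carpark_number"]: occupancy_rate(c) for c in carparks}

# Fullest carparks first, skipping those without a defined rate
fullest = sorted(
    (name for name, rate in rates.items() if rate is not None),
    key=lambda name: rates[name],
    reverse=True,
)
print(rates)
print(fullest)
```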

3.3 Taxi Availability API

This API provides taxi availability as GeoJSON data. I found this quite interesting, as the API refreshes every 30 seconds, so the total number of available taxis fluctuates from call to call. We can use the folium library in Python to plot each taxi as a Marker. As the number of taxis returned was over 3,000, I chose to plot only the first 100 to reduce lag in the plot.
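Before plotting, the coordinates have to be pulled out of the GeoJSON. The sketch below assumes the response is a FeatureCollection whose features carry MultiPoint geometries. That shape and the sample coordinates are assumptions for illustration, not taken from the notebook. GeoJSON stores positions as [longitude, latitude], whereas folium.Marker expects [latitude, longitude], so the pair is swapped.

```python
# Hypothetical response in the assumed shape; coordinates are made up
taxi_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "MultiPoint",
                "coordinates": [[103.80, 1.30], [103.85, 1.35], [103.90, 1.32]],
            },
        }
    ],
}

def taxi_locations(geojson, limit=100):
    """Return up to `limit` (latitude, longitude) pairs."""
    points = []
    for feature in geojson["features"]:
        for lon, lat in feature["geometry"]["coordinates"]:
            points.append((lat, lon))  # swap GeoJSON order for mapping libraries
    return points[:limit]

locations = taxi_locations(taxi_geojson, limit=2)
print(locations)
```

Each pair in locations could then be passed as the location of a folium Marker.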

Image by the Author | First 100 Taxis Available by API

Honestly, I am not sure what this API could be extended into. No other information is provided about the available taxis, so the API is primarily useful only for telling the:

  1. Number of taxis available in Singapore
  2. Locations of available taxis

I would guess that the taxi operator ComfortDelGro has already integrated this information into its application (app).

3.4 Weather API

There are 5 distinct weather APIs available, some of which could honestly be aggregated together. Nonetheless, I’ll walk through them in case they are useful to you.

Weather Forecast

This API provides 2-hourly, 24-hourly, and 4-day weather forecasts. In my case, I used only the 2-hourly forecast, but the remaining forecasts work the same way. For instance, if the datetime requested is 11 Jan 11 pm, the last instance returned would be the forecast for 12 Jan 1 am.

The weather forecast API provides a specific forecast for over 47 different areas in Singapore. I called this API at approximately 11 pm on 11 Jan, and a 2-hourly forecast of "Light Rain" was returned for all areas.
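To check quickly whether every area shares the same forecast, the per-area entries can be tallied. The excerpt below is hypothetical: it assumes each item carries a forecasts list of area and forecast pairs, and the area names and values are illustrative only.

```python
from collections import Counter

# Hypothetical excerpt in the assumed shape; names and forecasts are illustrative
forecast_item = {
    "forecasts": [
        {"area": "Ang Mo Kio", "forecast": "Light Rain"},
        {"area": "Bedok", "forecast": "Light Rain"},
        {"area": "Jurong West", "forecast": "Cloudy"},
    ]
}

# Map each area to its forecast, then count how often each forecast occurs
by_area = {f["area"]: f["forecast"] for f in forecast_item["forecasts"]}
summary = Counter(by_area.values())
print(summary.most_common())
```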

Image by the Author | 2-hourly Weather Forecast

Realtime Weather Readings

The name provided is "Realtime Weather Readings", but it basically only returns the temperature recorded at several MET stations. I was quite delighted, though not surprised, to see a location plotted at my school (Nanyang Avenue), as I vaguely recall passing by a MET station behind the Administration Building in NTU.

Image by the Author | Realtime Weather Reading (Temperature)

PSI

Pollutant Standards Index (PSI) measures air quality. PSI has been an important metric in Singapore in recent times, as the country is susceptible to haze from various hotspots in Indonesia. PSI is reported for 5 areas: "north", "south", "east", "west", and "central", and the "national" reading would be the maximum across all areas.
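If the national reading is indeed the maximum across the regions, as described above, it can be reproduced in one line. The readings below are made up purely for illustration.

```python
# Hypothetical regional readings; the numbers are made up
psi_readings = {"north": 52, "south": 48, "east": 55, "west": 50, "central": 51}

national = max(psi_readings.values())                   # highest regional value
worst_region = max(psi_readings, key=psi_readings.get)  # region holding that value
print(national, worst_region)
```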

Image by the Author | PSI Readings in Singapore

Multiple pollutant sub-indices are also available, such as the levels of carbon monoxide and nitrogen dioxide. We can attempt to recreate the portal’s 24-hour plot of all pollutant sub-indices. First, we make 2 API calls, one with today’s date and one with yesterday’s. Thereafter, we concatenate both datasets, sort by timestamp, and keep the last 24 instances for the following plot.
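The concatenate, sort, and slice steps can be sketched in plain Python. The items below are synthetic: make_day is a hypothetical helper that fabricates one entry per hour with a placeholder value field, standing in for the two API responses.

```python
# Hypothetical hourly items for one day; "value" is a placeholder field
def make_day(date_string):
    return [
        {"timestamp": f"{date_string}T{hour:02d}:00:00+08:00", "value": hour}
        for hour in range(24)
    ]

yesterday_items = make_day("2021-01-10")
today_items = make_day("2021-01-11")[:15]  # today is only partly complete

# ISO 8601 timestamps in the same time zone sort correctly as plain strings
combined = sorted(yesterday_items + today_items, key=lambda item: item["timestamp"])
last_24 = combined[-24:]
print(last_24[0]["timestamp"], last_24[-1]["timestamp"])
```

The notebook may well use a DataFrame for this step instead. The list-based version is shown only because it runs without extra dependencies.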

Image by the Author | Pollutant Sub-Indices Plot of Last 24 Hours

PM 2.5

Particulate Matter 2.5 (PM2.5) refers to fine particles in the air, 2.5 micrometres or smaller in diameter, that reduce visibility. This data is already available in the PSI API, so I shall not repeat the explanation.

Image by the Author | Redundant PM2.5 Plot

UVI

Ultraviolet Index (UVI) describes the level of solar UV radiation at the earth’s surface. This is the most underwhelming API of all, as it provides the least data. After passing a date parameter into the GET request, only the text "Healthy" is returned. There is no further geographical breakdown either.

Image by the Author | UVI API

4. Conclusion

All in all, I’ve attempted to peek at the various APIs made available by our government so you do not have to. Truthfully, I believe these APIs are not at all helpful for creating any apps or solving any real-world problems. There may be a real need to protect data privacy for the more interesting datasets but this may come at a cost of stifling potential solutions or innovation.

I wonder whether data from apps such as SpaceOut could be opened up in a similar way to carpark availability, disclosing the number of patrons in each shopping mall (and perhaps a ‘total patrons’ capacity for the mall, since there is a "max" value). This could help in analyzing consumer sentiment and serve as a proxy indicator of economic growth for stakeholders such as prospective and existing shareholders of shopping mall stocks.

