When, where and how frequently bike rentals take place in a city bike-share system reflect the dynamics of the city. Public bike-share systems serve the shorter-distance trips that people make in urban transport. These include commuting from home to work, going from the railway station to work or from lecture hall to coffee shop, or taking a pleasant ride around a park or along the waterfront (weather permitting).
Many of the public bike-share systems publish detailed usage data as part of open data initiatives. Descriptive data analysis has revealed spatial and temporal variations for London and for Hamburg, and intricate time-series forecasting has been done as well.
In early 2020 the COVID-19 pandemic struck. It altered transport dynamics and economic activity around the globe. Reduced air travel and novel challenges for container transport by sea are well-known examples. But how has the shorter-distance transport of people in cities changed?
Taipei in Taiwan and London in England are two cities with detailed and well-structured data on their bike-share systems. They are also in countries that have experienced different pandemic conditions.
How have bike rentals changed due to COVID-19, both in aggregate and in their nature, in Taipei and London, and what does the response say more broadly about human action and its dynamics?
Baselines: Number of Events as Function of Where and When
A comparison between the two cities, or a before-and-after-COVID analysis, begins with the creation of a baseline. A baseline model embodies what counts as typical.
The difference of interest is therefore the change relative to the respective baseline once the pandemic becomes part of the picture.
I use bike-share data from 2019 to construct the baselines.
One useful aggregation in building the baseline is the distribution of the number of rental events per day for each bike rental station. A rental event is defined as a bike ride either starting or ending at said station.
The figure below shows the distribution of the number of rental events on weekdays for the twenty highest-traffic stations in London.
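As a rough sketch of this aggregation, the snippet below counts rental events per station and day from a list of trips. The tuple layout `(day, start_station, end_station)`, the function name and the sample trips are assumptions made for illustration; the published raw data is formatted differently. Counting a round trip once at its station is also an assumption, since the definition above does not settle that case.

```python
from collections import Counter
from datetime import date

def daily_station_events(trips):
    """Count rental events per (station, day).

    `trips` is an iterable of (day, start_station, end_station) tuples
    (a hypothetical layout, not the format of the raw data).
    """
    counts = Counter()
    for day, start, end in trips:
        # A ride is an event at every station it starts or ends at.
        # Using a set means a round trip counts once (an assumption).
        for station in {start, end}:
            counts[(station, day)] += 1
    return counts

# Made-up trips, for illustration only.
trips = [
    (date(2019, 5, 6), "A", "B"),
    (date(2019, 5, 6), "B", "C"),
    (date(2019, 5, 6), "C", "C"),  # round trip
    (date(2019, 5, 7), "A", "C"),
]
station_day_counts = daily_station_events(trips)
```

Collecting the daily counts of one station over all weekdays of 2019 would then give the kind of per-station distribution that a box plot summarizes.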

The figure below is from the London weekend baseline. Note that the stations on the horizontal axis are not necessarily the same as in the previous figure.

Another aggregation is the number of rental events by hour of the day for each station. As an illustration, I visualize this type of distribution for a station near the large Liverpool Street station in London on weekdays.

This is one among many per-station distributions, and I will have more to say about these distributions below.
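The hour-of-day aggregation for a single station could be sketched as follows. The `(day, hour)` event layout and the function name are hypothetical, and the sketch only knows about days on which the station saw at least one event, which is a simplification.

```python
from collections import Counter
from statistics import median

def hourly_count_samples(events):
    """Group one station's events into daily counts for each hour.

    `events` is an iterable of (day, hour) pairs with hour in 0..23
    (a hypothetical layout). Returns {hour: [count per observed day]},
    with zeros for hours that saw no event on an observed day.
    """
    events = list(events)
    per_day_hour = Counter(events)
    days = sorted({day for day, _ in events})
    return {
        hour: [per_day_hour[(day, hour)] for day in days]
        for hour in range(24)
    }

# Made-up events for one station, for illustration only.
station_events = [
    ("2019-05-06", 8), ("2019-05-06", 8), ("2019-05-06", 18),
    ("2019-05-07", 8), ("2019-05-07", 17),
]
samples = hourly_count_samples(station_events)
median_by_hour = {hour: median(counts) for hour, counts in samples.items()}
```

Each list in `samples` is the raw material for one box in an hour-by-hour box plot of the station.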
The equivalent illustrations of data aggregations for Taipei follow:
Distribution of bike rentals on weekdays for top traffic stations.

The same for weekends.

And finally, one illustrative example of the distribution of number of rental events for a specific station (臺大資訊大樓) for each hour during weekdays.

In absolute numbers the Taipei bike-share is bigger, as well as more concentrated or efficient in terms of rental events per station. Population and density are reflected in these relations, though the spatial distribution of bike stations is a factor too.
A relative difference between Taipei and London is that the former has fatter tails towards few rental events in a day. Extreme deviations from the median are in other words more common in Taipei than in London.
From inspection of a handful of the very low values, extreme weather events, like the typhoons Mitag and Lekima, appear to be the perturbations that cause the extreme deviations. London, though not known for pleasant weather, is rarely exposed to typhoon-level winds and rainfall. Climate, at least, makes activity in the two bike-share systems distinct.
These empirical distributions for London and Taipei are part of the baseline and are used below in order to compare data from 2020 and to test what impact COVID-19 has on the urban transport.
But first I will dig a bit deeper into the baselines and ascertain a useful latent dimension in the data.
Station Archetypes Mined From the Data
The box-plot illustrations above suggest the bike stations can be quite different in nature. That is, they differ not just in total rental events, but also in how rental events distribute over the time of day, or weekday versus weekend. I have shown this in greater detail elsewhere for London.
This hints that multiple social activities overlap and manifest themselves in the per-station rental data – however to different degrees for different bike stations. These social activities in turn may be impacted differently by the COVID-19 perturbation.
To discover and visualize types of bike stations from the data itself, I use the Jensen-Shannon divergence. Its square root, the Jensen-Shannon distance, is a metric that quantifies how different a pair of per-station probability distributions are.
For example, on weekdays, the rental events at station 大安運動中心 in Taipei are on average distributed over the twenty-four hours of the day as shown on the left in the figure below. Note that the diagram is normalized.
The Jensen-Shannon distances to two other stations in Taipei are shown as well. The smaller of the two distances corresponds to a distribution shaped similarly to that of 大安運動中心, while the greater corresponds to a distribution of a more distinct shape.

The Jensen-Shannon distance puts a number on how distinctly the rental events distribute over the time of the day, disregarding the total number of rental events at the stations.
These curves are 24-dimensional points constrained to sum to unity, and the Jensen-Shannon distances tell us how they are positioned relative to one another.
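For readers who want to see the computation, here is a minimal sketch of the Jensen-Shannon distance between two normalized 24-hour profiles. The base-2 logarithm (which bounds the distance between 0 and 1), the function names and the three hourly profiles are choices made for this illustration and are not taken from the analysis itself.

```python
import math

def normalize(counts):
    """Scale non-negative counts so they sum to one."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; zero-probability terms are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence of p and q."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))

# Made-up hourly counts (24 values each), for illustration only.
commuter = normalize([0] * 7 + [30, 60, 20] + [5] * 6 + [20, 55, 35] + [2] * 5)
similar = normalize([0] * 7 + [25, 65, 25] + [6] * 6 + [25, 50, 30] + [3] * 5)
evening = normalize([1] * 10 + [5] * 6 + [20, 30, 40, 45, 40, 30, 20, 10])

d_similar = jensen_shannon_distance(commuter, similar)
d_evening = jensen_shannon_distance(commuter, evening)
```

Because the profiles are normalized first, doubling every count at a station leaves its distance to every other station unchanged, which is the sense in which total traffic is disregarded.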
To visualize these relative positions, dimensionality reduction is employed. I use the multidimensional scaling (MDS) method, which performs a non-linear transformation into a lower dimension with the objective of reproducing, as closely as possible, the distances in the higher dimension as distances in the lower dimension.
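To make the idea concrete, the sketch below embeds a small precomputed distance matrix in two dimensions using SMACOF iterations, one common algorithm for metric MDS. Whether the analysis used this exact algorithm is not stated, so treat it as an assumption; the function names, the iteration count and the 4-by-4 distance matrix are likewise made up for illustration.

```python
import math
import random

def pairwise_distances(points):
    """Euclidean distance between every pair of points."""
    return [[math.dist(a, b) for b in points] for a in points]

def stress(points, target):
    """Sum of squared differences between embedded and target distances."""
    d = pairwise_distances(points)
    n = len(points)
    return sum((d[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def smacof_step(points, target):
    """One Guttman transform; it never increases the stress."""
    n, dim = len(points), len(points[0])
    d = pairwise_distances(points)
    updated = []
    for i in range(n):
        row = [0.0] * dim
        for j in range(n):
            if i == j or d[i][j] == 0.0:
                continue
            ratio = target[i][j] / d[i][j]
            for k in range(dim):
                row[k] += ratio * (points[i][k] - points[j][k])
        updated.append([value / n for value in row])
    return updated

def mds(target, dim=2, iterations=200, seed=0):
    """Embed a symmetric distance matrix, starting from random points."""
    rng = random.Random(seed)
    points = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in target]
    for _ in range(iterations):
        points = smacof_step(points, target)
    return points

# Made-up distances between four stations: two similar pairs.
target = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.5, 0.6],
    [0.6, 0.5, 0.0, 0.2],
    [0.7, 0.6, 0.2, 0.0],
]
embedding = mds(target)
```

The resulting coordinates are only defined up to rotation, reflection and translation, so a different seed can give a differently oriented but equivalent picture.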
The visualization is shown below for the top-25% highest traffic stations in London and Taipei.

Each disk corresponds to a bike rental station, green ones in London, purple ones in Taipei. Note that only the relative positions matter; the exact values of the MDS coordinates are not informative in themselves.
A few specific weekday station distributions are shown in the next figure.

Observations:
- The (almost) pure commuter stations of London are found in the upper left corner. Taipei does not have any stations that as cleanly follow the morning and evening commuters travelling to and from the city by train.
- The unique type of rental station distribution of Taipei is instead one with events far more spread out over the day, with especially high counts late in the day, see the lower right. Night markets? Late snacks in the warm evening?
- The median Taipei station is in many ways similar to the median London station, though the former has a greater number of rental events taking place later in the day. This is the dominant distinction that shifts the clouds of purple and green points apart.
- In both London and Taipei there are a handful of stations where rental events are much more common late in the day than early in the day, see lower right. Fewer offices, and more bars and theaters, are likely found at the corresponding locations.
What The Baseline Enables
The baseline is itself something that can be analyzed, explained and modelled along the lines I have sketched (or done in the past). Interesting differences in structure, economics and response to weather of the people in the two cities manifest themselves in the quantities. For the purposes of this text, I wish to use the baseline to contrast with 2020 data – the year of life with COVID-19.
The baseline embodies a number of distributions of what constitutes typical with respect to time and place in London and Taipei. Therefore, the baseline enables the quantification of how atypical a collection of bike rental data is along select segments of time and place.
Taipei Relative to the Baseline: Weather + Holidays, but No COVID-19
The 2020 data for Taipei and the top-25% highest traffic stations are retrieved, transformed and aggregated into number of rental events for each station for each day of the year. These counts are in turn contrasted against the baseline.
That is, for a station, say 大安運動中心, and a day of the year, say Thursday April 23 2020, the daily count of rentals at said station is compared against the weekday baseline distribution for said station. The 2020 count value is mapped onto the normalized cumulative empirical baseline distribution.
Or to put it a bit more concretely: a 2020 count value that falls on the lower whisker in the corresponding box plot (as illustrated above) maps to 0.05; or if it falls on the median value, the mapping returns 0.5; or if it falls above the upper whisker, the mapping returns a value greater than 0.95, and so on. The percentile in the baseline distribution is thus returned for any given 2020 count value.
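A minimal sketch of this mapping is an empirical cumulative distribution function evaluated at the 2020 count. The function name and the ten baseline counts are invented for illustration, and treating ties with "less than or equal" is an assumption.

```python
from bisect import bisect_right

def baseline_percentile(baseline_counts, value):
    """Fraction of baseline days whose count is at most `value`."""
    ordered = sorted(baseline_counts)
    return bisect_right(ordered, value) / len(ordered)

# Made-up 2019 weekday counts for one station, for illustration only.
baseline = [120, 135, 150, 160, 170, 180, 190, 205, 220, 240]

typical_day = baseline_percentile(baseline, 170)   # middle of the baseline
quiet_day = baseline_percentile(baseline, 100)     # below every baseline day
busy_day = baseline_percentile(baseline, 300)      # above every baseline day
```

In practice the sorted baseline would be computed once per station and day type (weekday or weekend) and reused for every 2020 day.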
The heat map visualization below shows these values relative to the baseline for each of the top-25% stations (rows) and each of the days (columns) for which there is published data in 2020 at the moment of my analysis.

The relation between colour and mapped percentile value is shown in this colour bar. Saturated green means well below the median, saturated purple well above the median, and so on.

In this glorious hodgepodge of green and purple I look for vertical or horizontal bands or patches of saturated green or purple. In other words, are there times or places where the Taipei bike sharing deviates in unison or persistently from baseline?
Outlier events in themselves are nothing strange – a 5% event happens, after all, once in every twenty instances without any perturbation to the system. If outlier events cluster in time or space, however, then something that calls for an explanation has been discovered. Hence the search for bands and patches.
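The visual search for vertical bands can be mimicked numerically: for each day, compute the share of stations whose percentile falls below 5%, and flag days where that share is far above what isolated outliers would produce. This is only a sketch of the idea. The 50% flagging cut-off, the function names and the percentile values are all hypothetical, and the analysis in this text relies on inspecting the heat map rather than on code like this.

```python
def low_share_by_day(percentiles, threshold=0.05):
    """Share of stations below `threshold` on each day.

    `percentiles` maps station -> list of daily percentile values,
    with every list covering the same days in the same order.
    """
    rows = list(percentiles.values())
    n_days = len(rows[0])
    return [
        sum(row[day] < threshold for row in rows) / len(rows)
        for day in range(n_days)
    ]

def flag_bands(shares, min_share=0.5):
    """Indices of days on which at least `min_share` of stations are low."""
    return [day for day, share in enumerate(shares) if share >= min_share]

# Made-up percentiles for four stations over five days.
percentiles = {
    "A": [0.40, 0.02, 0.55, 0.03, 0.70],
    "B": [0.60, 0.04, 0.45, 0.50, 0.30],
    "C": [0.03, 0.01, 0.80, 0.65, 0.52],
    "D": [0.75, 0.03, 0.35, 0.48, 0.61],
}
shares = low_share_by_day(percentiles)
band_days = flag_bands(shares)
```

In this toy data, single stations dip below 5% on two separate days, but only one day has every station low at once, and that is the day a vertical band would appear.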
The three most pronounced vertical bands of deviation from baseline are annotated below.

Major holidays and extreme weather show up (can you see the faint trace of the Dragon Boat Festival?). One can dig deeper into this, including horizontal relations that suggest something has changed in the spatial distribution.
However, as far as the question I posed in the beginning is concerned, there is no evidence of COVID-19 impacting bike-sharing in Taipei. The official data of Taiwan reports the first cases in late January, peaks in mid-March, and very few new cases by April. The cumulative number of cases in all of Taiwan has been extremely small (~450 by July). Any impact these relatively small numbers might have had on short-range urban transport is within the range of what is typical for Taipei.
London Relative to the Baseline: Big Dip in March Before Lockdown With Quick Weekend Activity Rebound
The same mapping as done for Taipei 2020 data is done for London 2020 data and the result is shown in the heat map below.

It is abundantly clear that in mid-March a major deviation from baseline towards lower values takes place. For several consecutive days, more or less all bike stations have rental event counts that under baseline behaviour would happen at a frequency of less than 5%, and for most stations far less.
The precise point of departure from baseline is somewhere between Monday March 16, when the typical percentile throughout London is a bit on the lower side, and Wednesday March 18, when the typical percentile for the stations is in the 5% territory. Note that formal lockdowns of schools, bars and restaurants took place on Friday March 20.
The high-level view of the heat map also reveals periodic vertical bands of pink and purple throughout at least April and May. These correspond to Saturdays and Sundays. The social activity that interacts with weekend bike rentals is in other words only below baseline during the two weekends after the March lockdown. After that there is at least a rebound to baseline, maybe even above baseline.
As in Taipei, extreme weather shows up in the data as well. During the pre-COVID weekend of February 15-16, Storm Dennis thoroughly showered London, which explains the narrow band of saturated green in mid-February. The contrast in saturation and width of that band to the COVID-19 period puts the latter in an interesting perspective.
Activity Impacts and Rebound Not Distributed Equally Over Space
Horizontal bands of saturated colours show that rental events are not impacted equally throughout London. I will take a closer look at two prominent classes of horizontal bands:
- Weekday resilient locations: the handful of stations that between March 16 and April 10 are not reduced relative to baseline.
- Weekday non-rebounder locations: the handful of stations that still remain far below baseline even in late June and early July.
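One way such subsets could be picked out programmatically is sketched below, using the median baseline percentile of a station within each date window. How the two subsets were actually ascertained is not spelled out here, so the function, the two cut-off values and the example percentiles are purely hypothetical.

```python
from statistics import median

def classify_station(lockdown_percentiles, summer_percentiles,
                     resilient_floor=0.4, non_rebound_ceiling=0.1):
    """Label a station from its weekday baseline percentiles.

    `lockdown_percentiles` covers March 16 to April 10 and
    `summer_percentiles` covers late June to early July. Both cut-offs
    are hypothetical. The resilient check runs first, so a station
    meeting both conditions is labelled resilient.
    """
    if median(lockdown_percentiles) >= resilient_floor:
        return "weekday resilient"
    if median(summer_percentiles) <= non_rebound_ceiling:
        return "weekday non-rebounder"
    return "other"

# Made-up percentile series, for illustration only.
park_station = classify_station([0.50, 0.60, 0.45], [0.50, 0.40, 0.60])
office_station = classify_station([0.01, 0.02, 0.03], [0.02, 0.05, 0.04])
mixed_station = classify_station([0.01, 0.02, 0.03], [0.30, 0.50, 0.40])
```

Using the median rather than the mean keeps a single stormy or unusually busy day from deciding the label.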
The two subsets of stations are ascertained and highlighted in a baseline MDS visualization for London. The large green disks are weekday non-rebounders and the large purple disks weekday resilients; the other circles are the other stations.

Recall that the relative coordinates of the data points are indicative of how rental events distribute at baseline conditions over the hours of the day. Therefore, the fact that large purple disks are mostly to one side, and that large green disks are on the other side, implies that bike stations that interact with distinct types of social activities in the city are perturbed differently by the pandemic condition.
To illustrate, one of the highlighted stations that does not rebound in the summer is at Bankside. The average number of rentals by hour on a weekday in the late June to early July interval is shown relative to the baseline distribution.

The 2020 data (red disks) are well below baseline in the morning and evening, and at the lower end of the baseline distributions in the middle of the day.
Another illustrative example, one of the highlighted stations that is not reduced from mid-March to mid-April, is in Battersea Park near Queen’s Circus.

The 2020 data (red disks) are mostly at the upper end of the baseline, morning hours the exception.
Both illustrative examples suggest the pandemic condition not only impacts different locations differently, but different times of the day too. This is further evidence that bike rental events interact with at least two distinct social activities (likely more), which differ in how they distribute over time and space – and importantly, differ in how they are perturbed by the conditions of the COVID-19 pandemic.
Urban Dynamics and Prospects of More Models
I find it intuitive that, say, commuting to work and going to a park for a relaxing stroll should exhibit different types of spatiotemporal features and different responses to the extreme conditions of a dangerous infectious disease. It certainly looks like the London data conforms to this intuition.
But what of other social activities that play out over shorter times and in more concentrated spaces?
The bike-share data can be used to discover additional relations, especially when cross-referenced against other types of data indexed on geography. A model of aggregate human responses to pandemic conditions could as well be tested against changes in bike rental events throughout a city. People moving to and from places is still central to life and action, digital technology notwithstanding.
I write this in early December 2020 when a second wave of infections is still in full swing in London (but not in Taipei), and when – thankfully – the first doses of vaccine have begun to be provided to persons in Britain. Unfortunately the data I use for my analysis is published with a few months delay. So what response the new pandemic condition has at this very moment on short-range human transport in London and Taipei (or other cities with high-quality open data) is something I cannot tell.
Still, stringing together a few real-time web APIs, data parsers and transformers (because data formats are as variable as ever), cloud databases and streaming data visualization is within the realm of the possible. As I have shown, such software and data engineering can be helpful as prescriptive action is considered, and insightful as we imagine what dynamic pathways a complex urban environment might spawn when a major deviation from the status quo arises.
Footnotes
- Raw data was gathered from the websites for Taipei and London. Cleaned-up and documented data is available as a Kaggle dataset.
- The raw data is differently formatted for the two cities and not without faults. The data wrangling and the data analysis are available in one of my GitHub repos.
- All visualizations are done with the Bokeh library, colour palette PRGn, and the code used to create them is available in the same GitHub repo.