
Introduction
The Metallica concert in Tartu (Estonia) was held on July 18th, 2019 at the Raadi Airfield, behind the Estonian National Museum (ERM). The event was sold out to 60,000 people (ERR, 2019). The municipality encouraged visitors to use public transportation, including the bike-sharing system. To improve mobility in the city during the event, the bike-sharing system added a virtual dock station called "Metallica parkla" where visitors could leave their bikes near the venue.
The bike-sharing system in Tartu is a success in smart mobility. The bikes are efficient and make it easy to get around the city. The subscription is affordable, and 500 of the 750 bikes are electric. Stations are spread across the city, so you can reach most places. The system was launched in June 2019 and, after only one month of operation, it was used to improve mobility at an event with 60,000 people. The system also provides real-time data on users locking and unlocking bikes. If you want to know more about the system, you can check the Tartu Smart Bike website.
I was curious about what the bike-sharing system data looks like, so I started looking for it. When I found it, a quick exploration showed, to my surprise, that July 18th, 2019 was available. I immediately wanted to check how the bike-sharing system worked during the Metallica concert, and that is why this story was written.
Objective
- To visualize the bike-sharing system dynamics in Tartu on July 18th, 2019, highlighting the trips to the Metallica concert
Data
Bike-sharing system tracking data
The bike-sharing data for July 18th, 2019 can be found in the Estonian Open Data portal and is available for public use. The portal actually provides more days in July, and the data comes in two different files: locations, with the GPS tracking data, and routes, with the Origin-Destination records between bike stations. Both files are in .csv format.
- LICENSE – The data is released under the Creative Commons Attribution-ShareAlike 3.0 Unported license (CC-BY-SA 3.0), so we are free to share, copy, redistribute, adapt, remix, or build upon it, even for commercial purposes.
Outcomes
- A web map with KeplerGl showing the bike movements to the Metallica concert’s bike station

- A static map with QGIS showing the bike movements to the Metallica concert’s bike station

Analysis
To start this Python workflow, clone the repository with the necessary data. It contains an empty notebook that you can use as a template to follow this practice.
Clone this Repository:
In this practice, I am using the free version of Anaconda. Make sure that your environment has the needed libraries installed. In this analysis I used geopandas 0.9.0, movingpandas 0.8rc1, keplergl 0.3.2, and NumPy 1.21.5.
Let’s import the libraries
# for Geospatial analysis
import geopandas as gpd
import pandas as pd
import numpy as np
# for trajectories visualization
import movingpandas as mpd
# for visualization
from keplergl import KeplerGl
# for Folders
import os
import warnings
warnings.filterwarnings('ignore')
Now, let’s create the folders for our final map and outputs
if not os.path.exists('root'):
    os.makedirs('root')
if not os.path.exists('output'):
    os.makedirs('output')
Visualizing the GPS tracking data
We will start by reading the file with the GPS tracking of the bike users.
# reading the locations file
locations = pd.read_csv(r'data/locations_20190718.csv')
locations.head()
When you check locations['coord_date'].unique() you will notice that the file for July 18th also contains records from other days. We are interested only in the 18th, so we need to remove those extra days.
# filter only the desired date
# .copy() gives an independent table, so the columns added later do not touch a slice
locations = locations[locations['coord_date'] == '2019-07-18'].copy()
If you check locations['coord_date'].unique() again, you will see that now we have only July 18th.
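The date filter is an exact string comparison on the coord_date column. A minimal sketch on made-up rows shows the effect; only the column name coord_date comes from the data, the values are invented for illustration.

```python
import pandas as pd

# Toy rows standing in for the locations file; the values are invented
toy = pd.DataFrame({
    "route_code": [1, 1, 2, 3],
    "coord_date": ["2019-07-17", "2019-07-18", "2019-07-18", "2019-07-19"],
})

# Boolean mask: True only where the date string matches exactly
mask = toy["coord_date"] == "2019-07-18"

# .copy() returns an independent table, so later column assignments
# do not act on a slice of the original
only_18th = toy[mask].copy()

print(only_18th["coord_date"].unique())
print(len(toy), "rows before,", len(only_18th), "rows after")
```

Comparing the row count before and after the filter is a quick way to see how many records belonged to the extra days.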
Then, we need to set up a timestamp column named t that the library movingpandas is going to use to create the trajectories.
# Arrange timestamp column
locations['timestamp'] = locations['coord_date'] + ' ' + locations['coord_time']
# Create timestamp type
locations['t'] = pd.to_datetime(locations['timestamp'], utc=True)
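To see what the two timestamp steps produce, here is a small sketch on invented rows. Note that utc=True labels the clock times as UTC; the source does not say whether the file stores local or UTC time, so treat that as an assumption.

```python
import pandas as pd

# Invented sample rows; only the column names come from the data
toy = pd.DataFrame({
    "coord_date": ["2019-07-18", "2019-07-18"],
    "coord_time": ["08:15:00", "21:40:30"],
})

# Join date and time into one text column
toy["timestamp"] = toy["coord_date"] + " " + toy["coord_time"]

# Parse the text into timezone-aware datetimes
toy["t"] = pd.to_datetime(toy["timestamp"], utc=True)

print(toy["t"].dt.tz)             # time zone attached to the column
print(toy["t"].dt.hour.tolist())  # hour of each record
```

The text column timestamp is kept next to t because it is handy for printing and for the map later on.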
Route ids stored as numbers can silently turn into decimals (for example 101 becoming 101.0), which would break the join with the station info later. To avoid that, we will change the route codes to strings.
# add string code to bike
locations['route_code'] = ['r-' + str(code) for code in locations['route_code']]
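The sketch below illustrates one common way numeric ids pick up decimals in pandas: a single missing value turns an integer column into floats. The ids are invented, and this is only one possible cause, not necessarily what happens in this dataset.

```python
import pandas as pd

# Integer ids with one missing value: pandas stores the column as float64
codes_with_gap = pd.Series([101, 102, None])
print(codes_with_gap.dtype)
print(codes_with_gap.astype(str).tolist())

# Prefixing clean integer ids gives stable text keys
clean_codes = pd.Series([101, 102, 103])
keys = "r-" + clean_codes.astype(str)
print(keys.tolist())
```

With text keys, both tables can be joined on exactly the same value regardless of how pandas stored the numbers.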
Then, we will get a clean table selecting only the columns that we need.
# Get the needed columns
locations = locations[['route_code', 'longitude', 'latitude', 'timestamp', 't']]
Next, we will quickly check whether there are NaN values that we need to remove. Fortunately, there are no NaN values here.
locations.isna().sum()
Then, a quick view with locations.head()

Let’s print some information about the data.
# some info
print('There are in total {} unique bike routes'.format(locations['route_code'].nunique()))
print('The first bike was used at {}'.format(locations['timestamp'].astype(str).min()))
print('The last bike was used at {}'.format(locations['timestamp'].astype(str).max()))
Create trajectories with Movingpandas
To start, we will create a new GeoDataFrame adding a geometry object.
# create a geodataframe
locations_gdf = gpd.GeoDataFrame(locations, geometry = gpd.points_from_xy(locations.longitude, locations.latitude, crs="EPSG:4326"))
locations_gdf.head()
Then, we will create a TrajectoryCollection object with our GPS tracking GeoDataFrame.
%%time
# Create a Trajectory Collection
bike_collection = mpd.TrajectoryCollection(locations_gdf, traj_id_col='route_code', t='t')
type(bike_collection)
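Conceptually, a trajectory is just the points of one route ordered in time. The plain Python sketch below illustrates that idea on invented points; it is not how movingpandas is implemented internally.

```python
from collections import defaultdict
from datetime import datetime

# (route_code, timestamp, longitude, latitude); invented sample points
rows = [
    ("r-2", "2019-07-18 09:00:10", 26.720, 58.380),
    ("r-1", "2019-07-18 08:00:20", 26.731, 58.371),
    ("r-1", "2019-07-18 08:00:00", 26.730, 58.370),
    ("r-2", "2019-07-18 09:00:00", 26.721, 58.381),
]

# Group the points by route id
tracks = defaultdict(list)
for route_code, stamp, lon, lat in rows:
    t = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    tracks[route_code].append((t, lon, lat))

# Order each group by time so the points form a path
for route_code in tracks:
    tracks[route_code].sort(key=lambda point: point[0])

for route_code, points in sorted(tracks.items()):
    print(route_code, [(lon, lat) for _, lon, lat in points])
```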
Adding station info
Our next step is to add the station names to every location in the GPS tracks. To reach our objective we need to highlight the trajectories that ended in "Metallica parkla", the new station set up at the venue where Metallica held the concert.
Let’s first read our route info.
# read the route info
route_info = pd.read_csv(r'data/routes_20190718.csv', encoding='utf-8')
route_info.head()
Let’s be sure that we have the data only for one day with print(route_info['unlockedat'].unique())
Then, we will subset only the columns we need and change the route code from number to string, as we did with the GPS tracking.
# get the needed columns (.copy() so the new column is not written to a slice)
bike_route_info = route_info[['route_code', 'startstationname', 'endstationname']].copy()
# add the route id as our workflow
bike_route_info['route_code'] = ['r-'+ str(code) for code in bike_route_info['route_code']]
bike_route_info.head()
Add to trajectory
Here we are going to create a new GeoDataFrame named bike_moves where we add the station information to the GPS tracks. The link between the two tables is the route code that we transformed from number to string.
The logic is to get every trajectory from the TrajectoryCollection in a loop, then send them to our bike_moves GeoDataFrame.
%%time
# adding bike route info to every GPS tracking
# bike-route-metallica
# total routes
total = locations_gdf.route_code.nunique()
n = 0
# collect one joined table per trajectory
moves = []
# loop in every trajectory by bike route id
for traj in bike_collection.trajectories:
    n = n + 1
    # info
    print(f'Adding info {n}/{total}\n')
    # join bike route info
    traj_moves_info = traj.df.merge(bike_route_info, on='route_code', how='left')
    # add to bike moves
    moves.append(traj_moves_info)
# combine everything once (DataFrame.append was removed in pandas 2.0)
bike_moves = gpd.GeoDataFrame(pd.concat(moves, ignore_index=True))
# let's check the result
bike_moves.head()
The loop prints its progress as it works through the 5253 trajectories.
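Because the join key is an ordinary column, the same left join can also be run once on the whole table instead of once per trajectory. This is a sketch of that alternative with plain pandas, not the approach used in this story; "Station A" and "Station B" are placeholder names.

```python
import pandas as pd

# Invented GPS points for two routes
points = pd.DataFrame({
    "route_code": ["r-1", "r-1", "r-2"],
    "longitude": [26.730, 26.731, 26.720],
    "latitude": [58.370, 58.371, 58.380],
})

# Invented station info, one row per route
info = pd.DataFrame({
    "route_code": ["r-1", "r-2"],
    "startstationname": ["Station A", "Station B"],
    "endstationname": ["Metallica parkla", "Station A"],
})

# how="left" keeps every GPS point;
# validate="many_to_one" raises an error if a route code repeats in info
joined = points.merge(info, on="route_code", how="left", validate="many_to_one")

print(joined[["route_code", "endstationname"]])
```

A left join never drops GPS points, so the joined table should have as many rows as the points table when each route appears once in the station info.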
We will also create LineStrings out of the GPS tracking for visualization in QGIS. The function we will use from Movingpandas is to_traj_gdf()
%%time
# bike-routes-metallica
# total routes
total = locations_gdf.route_code.nunique()
n = 0
# collect one line table per trajectory
lines = []
# loop in every trajectory by bike route id
for traj in bike_collection.trajectories:
    n = n + 1
    # info
    print(f'Adding info {n}/{total}\n')
    # creating a LineTrajectory
    traj_line = traj.to_traj_gdf()
    # join bike route info
    traj_moves_info = traj_line.merge(bike_route_info, left_on='id', right_on='route_code', how='left')
    # add to bike lines
    lines.append(traj_moves_info)
# combine everything once (DataFrame.append was removed in pandas 2.0)
bike_lines = gpd.GeoDataFrame(pd.concat(lines, ignore_index=True), crs=4326)
# remove a duplicate
del bike_lines['id']
# let's check the result
bike_lines.head()
This file is ready for visualization in QGIS. We will save it to make a map later.
# save the result
bike_lines.to_file(r'output/bike_routes_metallica_concert.gpkg', driver='GPKG')
Prepare datasets for visualization
We will create two functions: one adds an attribute for highlighting the Metallica moves and the other gives a weight for size.
# add code for those involved in Metallica concert
# function that defines the "Metallica parkla" as end
def metallica_code(station_name):
    '''
    Evaluates if station_name is metallica parkla and returns a code
    station_name <str>
    '''
    if station_name == 'Metallica parkla':
        return 'Metallica'
    else:
        return 'Bike user'

def metallica_weight(station_name):
    '''
    Evaluates if station_name is metallica parkla and returns a weight
    station_name <str>
    '''
    if station_name == 'Metallica parkla':
        return 5
    else:
        return 1
Then, we will apply the functions only to the ending station. Our interest is to see the moves towards the concert. If we also flagged "Metallica parkla" as a start station, the visualization might become overloaded.
# apply function
# end
bike_moves['viz_code'] = bike_moves.apply(lambda row: metallica_code(row['endstationname']), axis=1)
bike_moves['weight'] = bike_moves.apply(lambda row: metallica_weight(row['endstationname']), axis=1)
Once we have added the visualization attribute we can check it out with print(bike_moves['viz_code'].unique())
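For larger tables, the same two attributes can be computed without a row-by-row apply. The sketch below uses numpy.where on invented rows and is an alternative to the functions above, not a replacement the story depends on.

```python
import numpy as np
import pandas as pd

# Invented end stations; "ERM" and "Metallica parkla" are names from the story
toy = pd.DataFrame({
    "endstationname": ["Metallica parkla", "ERM", "Metallica parkla"],
})

# One boolean column decides both attributes
is_metallica = toy["endstationname"] == "Metallica parkla"

toy["viz_code"] = np.where(is_metallica, "Metallica", "Bike user")
toy["weight"] = np.where(is_metallica, 5, 1)

print(toy)
```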
Adding the venue’s location
We will create a new GeoDataFrame with the location of the Metallica concert venue.
First, let’s import Point geometry objects from shapely
from shapely.geometry import Point
Then, the GDF.
# create the venue location
# shapely's Point takes x (longitude) first, then y (latitude)
venue = gpd.GeoDataFrame({'name': ['Metallica venue']}, geometry=[Point(26.752595, 58.397144)], crs=4326)
venue.head()
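Coordinate order is easy to mix up, because geometry constructors such as shapely's Point expect x (longitude) first and y (latitude) second. A quick sanity check can catch a swapped pair. The helper looks_like_tartu and the bounding box values below are hypothetical and only approximate.

```python
# Rough bounding box around Tartu in degrees (approximate, for illustration only)
LON_MIN, LON_MAX = 26.5, 27.0
LAT_MIN, LAT_MAX = 58.2, 58.5


def looks_like_tartu(x, y):
    """Return True if (x, y), read as (longitude, latitude), falls inside the box."""
    return LON_MIN <= x <= LON_MAX and LAT_MIN <= y <= LAT_MAX


# Longitude first, latitude second
print(looks_like_tartu(26.752595, 58.397144))

# Latitude first by mistake
print(looks_like_tartu(58.397144, 26.752595))
```

If the venue marker does not appear next to the bike tracks on the map, a swapped coordinate pair is the first thing to check.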
Visualization with KeplerGl
We are going to use KeplerGl for Jupyter notebooks to visualize our spatial and temporal data.
Let’s start by creating an instance
# create an instance
m = KeplerGl(height = 600)
Let’s add our data. I recommend adding data in two different cells.
# add the data
m.add_data(bike_moves, 'Bike Moves')
m.add_data(venue, 'Metallica venue')
We will import a map configuration from the configuration.py file included in the repository. A configuration is a set of details that style the map visualization in KeplerGl. I added my own configuration to the .py file for this story.
from configuration import config
Then, we will simply save the file.
m.save_to_html(file_name='root/metallica_moves.html', config=config)
Once we open it, we will have our final map

How to add your own configuration to KeplerGL?
After you have added the data to the map instance open it in Jupyter notebook.
m
You will see a default visualization of your data.

Configure it in KeplerGl as you want. Then, check the configuration in a new cell.
m.config

Then, copy and paste the configuration and use it in the m.save_to_html() function as we did before.
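Instead of copying by hand, the configuration dictionary can be written to a .py file and imported later. The sketch below uses a small stand-in dictionary in place of m.config so that it runs without KeplerGl; a real configuration is a much larger nested dictionary, and the temporary folder is only for the demonstration.

```python
import importlib.util
import os
import pprint
import tempfile

# Stand-in for m.config (assumption: the real value is a plain nested dict)
example_config = {"version": "v1", "config": {"mapState": {"zoom": 11}}}

# Write the dict as Python source so it can be imported later
path = os.path.join(tempfile.mkdtemp(), "configuration.py")
with open(path, "w") as f:
    f.write("config = " + pprint.pformat(example_config))

# Load the file back as a module to confirm that it round-trips
spec = importlib.util.spec_from_file_location("configuration", path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

print(module.config == example_config)
```

In the notebook, you would pass m.config instead of example_config and write the file next to the notebook so that from configuration import config finds it.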
Visualization in QGIS
We already saved a file in the output folder with the trajectories as lines. Simply open QGIS, drag the file in, and make your own map.
If you highlight the ending station "Metallica parkla", it should look like this one

Discussion
Note that our visualization highlights the trips that started at other stations and ended at the venue station "Metallica parkla". This does not mean that these riders were the only ones who cycled to the concert. The "ERM" station is at the same location as the venue, so many people most probably left their bikes there as well. The visualization also shows that some trips ending at other stations finished close to the venue.

The visualization just gives an idea of how the virtual dock station worked during the Metallica concert in the city.
Conclusion
Visualizing the virtual station "Metallica parkla" gives a quick understanding of how the bike-sharing system worked in Tartu during the Metallica concert. We can clearly see a peak in the usage of the "Metallica parkla" station during the event hours in the evening. The virtual station worked as intended for people who wanted to reach the concert by bike.
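The evening peak can also be checked with a simple count of trips per hour. The sketch below uses invented rows with the unlockedat and endstationname columns from the routes file; the timestamp format is an assumption, and the counts say nothing about the real data.

```python
import pandas as pd

# Invented rows; the column names follow the routes file, the values do not
routes = pd.DataFrame({
    "unlockedat": [
        "2019-07-18 09:05:00",
        "2019-07-18 18:10:00",
        "2019-07-18 18:45:00",
        "2019-07-18 19:20:00",
    ],
    "endstationname": ["ERM", "Metallica parkla", "Metallica parkla", "Metallica parkla"],
})

# Keep only the trips that ended at the venue station
to_venue = routes[routes["endstationname"] == "Metallica parkla"].copy()

# Hour in which each of those trips started
to_venue["hour"] = pd.to_datetime(to_venue["unlockedat"]).dt.hour

# Number of trips per starting hour
trips_per_hour = to_venue.groupby("hour").size()
print(trips_per_hour)
```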
The visualization does not necessarily show everyone who cycled to the concert, because some people may have left their bikes at the other venue station, "ERM". Other possibilities beyond "ERM" and "Metallica parkla" exist, but they are not reviewed in this story.
If you have questions do not hesitate to leave a comment here.
Connect in LinkedIn here: Bryan R. Vallejo
Get access to my tutorials here: Join my stories