Acquiring a realistic traffic simulation by training a traffic model with real-world data.

In this article, I will discuss how I defined a traffic network, a traffic model, and how I trained it with flow data that I’ve collected.
Traffic networks can be represented as directed graphs and the vehicular load on the roads can be taken as numerical load values over the edges of the graph. While microscopic traffic models formulate interactions between individual vehicles, macroscopic traffic models consider the traffic flow. By defining a traffic model, a dynamic system will be formed over the graph and the traffic behavior can be simulated.
A collected and well-preprocessed real-world traffic flow data can be used to train the traffic model. The model then can be used to study the topologically weak parts of the traffic network to modify the network to improve the lower the congestion levels.
Traffic Network
Let’s give the preliminary definitions we need.

Special types of matrices can be used to hold the information of the attributes of graphs.

Graphs are objects that hold the information of the relationship between nodes. Line graph of a graph intuitively demonstrates the relationship of edges of a graph. Since the dynamics of traffic networks, such as traffic flow, considers the relationship between roads, we need to consider the line graph of the traffic network.

The traffic network can be considered as a directed graph G where each rode r is an edge and each junction j can be considered as a node.

To achieve realistic results, we need information on road types and road lengths. Then, we can define a capacity constant for each graph based on those values. Let _Hr be the road type, _lr be the road length of the road r, and define a function _λ(Hr) be the number of lanes of on road type _Hr. Then define the capacity of r as _C_r = λ(H_r)lr.
Capacity is the maximum number of traffic load a road can hold. Let each road hold a load value _L_r(t)∈[0,Cr]. Also, the congestion level of a road is defined to be _Lr(t)Cr.
Using Python’s osmnx package, we are able to import a real-world traffic network into a networkx graph. The package gathers the real spatial information from Open Street Map and uses it to generate a network. The library gives us the following information: the name, length, and road type.
For this article, I picked Kadikoy, Istanbul’s traffic network to play with.
import osmnx as ox
from networkx import *
# Defining the map boundaries
#Kadikoy kordinatlari
north, east, south, west = 40.9939, 29.0346, 40.9786, 29.0146
# Downloading the map as a graph object
G = ox.graph_from_bbox(north, south, east, west, network_type = 'drive')
draw(G, linewidths=1, node_color='w', node_size=15, arrowsize=5, pos=pos)
ax = plt.gca() # to get the current axis
ax.collections[0].set_edgecolor("#000000")

Now, let’s create the line graph of this network.
L = nx.line_graph(G)
draw(L, linewidths=1, node_color='w', node_size=15, arrowsize=5, pos=pos2)
ax = plt.gca() # to get the current axis
ax.collections[0].set_edgecolor("#000000")

Traffic Model
In the previous section, a traffic network was generated. To simulate the traffic flow on this network, a traffic model is needed. We need to define a model, which can obtain the information that each node holds and gives us the outcome. Generally the popular macroscopic models, such as METANET doesn’t consider smaller, in-city roads but only focus on highways.
The model we are going to define will consider each and every road on the network. The models are formulated using stochastic processes and the idea of random walks.



Let P be an n×n probability transition matrix and let π(t) be an n-dimensional non-negative vector that holds the initial load distribution over the nodes at time t. Then the distribution in the next time step is π(t+ 1) =(π(t)^T)P, where x^T defines x‘s transpose.
We have two assumptions. The first assumption is that at each time step, every traffic light is synchronized and at every junction, turn green for once at each time step. The second assumption is that the total load is constant. The second assumption is a temporary assumption and a time-based load control mechanism will be introduced later on.
Normally, the most common random walk is the one that is defined with respect to the number of degrees.
Let G be a graph and let A be the adjacency matrix and let D be the degree matrix of the graph. The probability transition matrix based on degree is defined as P = D^(-1)A, where P(i,j)=A(i,j)/deg(i).
But this model doesn’t respect the capacity values of roads and so we are not able to simulate the cascade effect of the traffic.
To prevent this and to be able to optimize the model, later on, the highway relation matrix M presented where M_(H_i,H_j) is the constant of flow going from a road with road type H_i to a road with road type Hj. First, let’s consider the road types. Let there is a flow from road r to p. If λ(Hr)> λ(Hp) the road becomes narrower as the cars travel, we are faced with a bottleneck. Bottlenecks make it harder for cars to continue without slowing down because the number of traffic lanes decreases and the cars are obliged to change lanes, and this causes an overall decrease in mobility. As the difference of road types becomes greater, the flow becomes slower. So,M(H_r,H_p) should be small. However, ifλ(Hr)< λ(Hp), then the flow won’t be slowed down because the lanes are actually increasing and the cars don’t have to slow down in order to change lanes. The load on road r is calculated by


And the probability transition matrix is:


Optimizing the Model
We want the model to act in a realistic manner. In the modeling phase, the model was given a road-type flow constant. In this section, we will optimize the model using those parameters. From HERE, a navigation company that offers traffic flow data with the use of their API, real-world traffic data is gathered for 1080 steps, with twenty-minute intervals. The collected data is used to optimize the model. The unprocessed data is processed and made to be a time series data that holds the congestion level for major roads in the Eminonu region. Eminonu is chosen, instead of Kadikoy, because it was the largest region that is offered by HERE with many different road types.
Let D_i(t) denote the real-world congestion level of road i at time t. Let the selected cost function be MSE. Let N be the number of data instances and K be the number of roads. Because L(t)is a recursive function, as t→ ∞, taking the derivative will be harder and harder. Also considering the real-world data depends on real-time, meaning at some times of the day the traffic levels will be higher or lower. To train the model, at each time step t+1, L_i(t) taken as D_i(t), and the Optimization is done for L_i(t+ 1), so L_i(t) is considered as a constant rather than a function and the optimization process made to be easier.
Let the time-series data to be:
D(0),D(1),D(2),D(3),D(4),D(5),D(6)…
The data will be grouped as (X,Y) where X is feature and Y will be predicted. Group the time-series data like this:
(D(0),D(1)), (D(1),D(2)), (D(2),D(3)),…
We are trying to find the 8×8 M matrix, which gives the minimum value for the cost function summed over the total simulation time τ. So every 64 variables should be optimized for. For time t+ 1, we want to optimize the following cost function:


where,

For each M_(a,b) where a, b ∈[1,8], gradient descent method is used for optimization.

Now, when we take the gradient, we need to calculate the minimum function at each time step and take the derivative of the part that is the minimum. Hence we have the following gradient,

The optimization is done for 30 epochs. At each epoch, we acquired a different matrix M. These 30 different M matrices are used to predict the values of Traffic loadedness in the next time step.
The left graph shows the different cost values for the first 10 generations of M matrices and the right graph shows it for the last 10 generations. For both graphs, the line plot is the cost values before the training, with the values that I gave intuitively, so the desired outcome is for cost values to stay below the line plot. It can be observed that the number of cost values that are below the line plot is increased after the model is trained for 30 epochs with 20 data pairs/data instances.

Thus, we can see that the trained model performed better than the untrained model in predicting the load distribution in the next iteration. Currently, another real-world data is being collected with smaller time intervals to train the model with better precision.