Graphs and Data Science

Graph Theory and Data Science

A topic intro with the Bridges of Königsberg

Jackson Gilkey
Towards Data Science
4 min readDec 10, 2019

--

Graph theory began all the way back to 1736 in the Prussian city of Königsberg . Back then the city was centered around two islands within the Pregel river, and these were connected with the mainland by seven distinct bridges. As the story goes, there arose a contest among local bridge lovers to see who could devise the optimal route to traverse each bridge across the Pregel while only crossing an individual bridge once. Thereby seeing all the bridges in their full glory in the minimum amount of time.

Fig. 1: a map of old Königsberg

While very simple to understand this problem proved to be very difficult to solve. Word of this puzzle spread until it caught the attention of mathematician Leonhard Euler. He was able to prove that the problem actually had no valid solution and in doing so laid the foundations of Graph Theory.

To easier understand his solution we’ll cover some Graph Theory terminology.

  • A Graph G(V, E) is a data structure that is defined by a set of Vertices (V) and and a set of Edges (E).
  • Vertex (v) or node is an indivisible point, represented by the lettered components on the example graph below
  • An Edge (vu) connects vertex v and vertex u together.
  • The Degree d(v) of vertex v, is the count of edges connected to it.
  • The Parity of a vertex refers to whether d(v) is even or odd.
  • An Eulerian Path is a path in a finite graph that visits every edge exactly once (allowing for revisiting vertices).
Fig 2: Example Graph with 6 Vertices and 7 Edges

Now back to Prussia! Lets convert the map of Königsberg and its bridges into a graph using what we’ve learned. In this graph-representation each landmass will become a vertex, and each bridge an edge.

Fig 3. Mapping Königsberg onto a Graph
Fig 4. Graph of Königsberg Bridges

Now lets try an Eulerian Path on this graph in the following order:

  • A → B → D
  • D → A → C
Fig 5. Eulerian walk attempt

I think that after we have traversed to vertex C is is easy to see that our path cannot be completed as we will end at either B or D.

Working through example walks like the one above Euler realized the heart of the problem concerned the parity of each vertex. That a traveler mid-walk would always need an entrance and exit edge from each vertex in order to continue their walk. So even parity is always needed in order to continue the walk.

Therefor for a traveler attempting an Eulerian walk along graph G the following must be true for it to be completed:

  • G must be connected
  • G must have exactly zero or two vertices of odd degree

Additionally if G has two vertices of odd degree then the walk must begin on one and end on the other odd vertex.

The field that grew out of this initial problem is only increasingly relevant in our highly connected world. Graph Theory can be used to represent and analyze a wide variety of network information and has numerous modern applications within Data Science.

Fig 6. Variety of Graph Representations of Networks

It has also been been fundamental to the development of numerous foundational algorithms from Google Page Rank to Netflix Content Recommendation.

Sources

https://math.stackexchange.com/questions/1173328/eulers-solution-of-seven-bridges-of-k%C3%B6nigsberg-in-layman-terms

https://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg

https://www.learner.org/courses/mathilluminated/units/11/textbook/02.php

https://www.storyofmathematics.com/16th_tartaglia.html

https://cs.mcgill.ca/~wlh/comp551/slides/25-gnns.pdf

--

--

Data Scientist. Studied at Flatiron in NYC. Strong interest in Graph Theory, Mapping, GIS, NLP, Non-linear Modeling, Python, Urban Agriculture and Mycology.