Social Network Analysis with Genshin Impact

Social network analysis in Python, applied to Genshin Impact.

Kaili
Towards Data Science

Photo by Omar Flores on Unsplash

0. What and Why — Context and Purpose

Genshin Impact is an open-world action role-playing game. On top of being named one of Google Play’s best games of 2020, it was also one of the top-grossing mobile games in the first half of 2021. Packaged with the game is its lore, in which the player and other characters weave in and out of story lines.

The way different characters in the game mention each other inspired this analysis of the social network of Genshin Impact’s characters, to answer questions such as who the most popular character in Teyvat is and who the most important one is.

1. Pre-Analysis: Data Sources and Scope

  • Data is manually collated from the Genshin Impact fandom wiki
  • Only playable characters are considered for analysis as they tend to be the characters most players are familiar with
  • Character A is considered connected to character B if A “mentions” B in either A’s story quest or voice lines (more on this later)
  • Note: the data for this exploration was collected on 28 July 2021, when the current banner (newest character) was Kamisato Ayaka. Inazuma characters not yet released at that time are not included (this was written before Inazuma’s quest line was known)
  • The data and full code used for the analysis can be found here. For brevity, code might be altered or omitted for inclusion in this writing.
  • The library used for the analysis is NetworkX (imported as nx)

2. The Genshin Social Network — Directed

A directed social network means ties between characters are directional and may not be mutual.

Building the directed social network
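The original code is not reproduced here, so the following is only a minimal sketch of how such a network might be built with NetworkX. It assumes the collated data boils down to (speaker, mentioned) pairs. The variable names `mentions`, `nation` and `nation_colour` are made up for illustration, and the handful of ties is a tiny sample taken from the observations in this article, not the full dataset.

```python
import networkx as nx

# Tiny illustrative sample of (speaker, mentioned) pairs, NOT the full dataset.
mentions = [
    ("Kazuha", "Ayaka"),       # one-way tie
    ("Zhongli", "Venti"),      # one-way tie
    ("Zhongli", "Tartaglia"),  # mutual tie, first direction
    ("Tartaglia", "Zhongli"),  # mutual tie, second direction
]

# Nation of each character, used only for colouring the nodes.
nation = {
    "Kazuha": "Inazuma",
    "Ayaka": "Inazuma",
    "Zhongli": "Liyue",
    "Venti": "Mondstadt",
    "Tartaglia": "Snezhnaya",
}
nation_colour = {
    "Mondstadt": "green",
    "Liyue": "orange",
    "Snezhnaya": "blue",
    "Inazuma": "purple",
}

# A DiGraph keeps the direction of each mention.
G = nx.DiGraph()
G.add_edges_from(mentions)
nx.set_node_attributes(G, nation, name="nation")

# One colour per node, in the same order NetworkX iterates over nodes.
node_colours = [nation_colour[G.nodes[n]["nation"]] for n in G]

# With matplotlib installed, the network could then be drawn with:
# nx.draw_networkx(G, node_color=node_colours)
```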

Fig 1. Genshin Directed Social Network
  • The characters are coloured according to their associated nation: green for Mondstadt, orange for Liyue, blue for Snezhnaya, purple for Inazuma, grey for the traveler (player) who is not associated with any one nation

Network Observations

  • There are a number of ties that are not reciprocated (one-way):

An example: Kaedehara Kazuha → Kamisato Ayaka, but no tie goes the other way, suggesting that Kazuha mentions, and hence is implied to know, Ayaka, while Ayaka is not implied to know Kazuha. The same holds for Zhongli → Venti.

  • Some connections are important bridges to other nations:

The only tie between Liyue and Snezhnaya is between Zhongli and Tartaglia. Beidou’s tie to Kazuha is the only tie between Liyue and Inazuma.

This might predict how the player would be introduced to newer regions — by having the player interact with these bridging characters (e.g. Kazuha, Tartaglia) first.

  • The Mondstadt characters seem to be a tighter cluster (more ties) than the Liyue cluster, which might suggest that more characters in Mondstadt know each other than the characters in Liyue.
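Observations like these do not have to be read off the picture. As a rough sketch (again on a tiny sample of ties rather than the real dataset, with a made-up `nation` lookup), one-way ties and cross-nation ties could be listed like this:

```python
import networkx as nx

# Tiny illustrative sample, NOT the full dataset.
G = nx.DiGraph([
    ("Kazuha", "Ayaka"),
    ("Zhongli", "Venti"),
    ("Zhongli", "Tartaglia"),
    ("Tartaglia", "Zhongli"),
])
nation = {
    "Kazuha": "Inazuma",
    "Ayaka": "Inazuma",
    "Zhongli": "Liyue",
    "Venti": "Mondstadt",
    "Tartaglia": "Snezhnaya",
}

# A tie a -> b is one-way when the reverse tie b -> a does not exist.
one_way = [(a, b) for a, b in G.edges if not G.has_edge(b, a)]

# A tie bridges nations when its two characters belong to different nations.
cross_nation = [(a, b) for a, b in G.edges if nation[a] != nation[b]]

# Share of directed ties that are reciprocated.
share_reciprocated = nx.reciprocity(G)
```

Filtering `cross_nation` by a specific pair of nations would show whether a single tie is the only bridge between them.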

Character Importance in the Network

To understand how important, or central, a character is in this network, centrality measures are used. The measures differ in how “importance” is interpreted.

Three centrality measures are applied to the directed graph: in-degree centrality, out-degree centrality and page rank centrality.

In-degree Centrality

refers to how many people are “sending ties” to a particular node. In our network, it refers to how many characters are mentioning a particular character and is a way to measure how popular a particular character is. A higher in-degree means that a character is mentioned a lot by others and is inferred to be more popular.

Fig 2. Top 10 Ranking of characters with highest in-degree centrality
  • The average in-degree centrality is 7 — each character has about 7 other characters mentioning them.
  • The most popular characters are mostly from Mondstadt, with Jean, Lisa, Kaeya and Klee nearly doubling the average.
  • With knowledge of the storyline, this result makes sense. The top 5 most popular characters are part of the governing body of Mondstadt and are frequently seen together.
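For reference, a ranking like this could be produced roughly as follows. This is a sketch on a made-up four-node graph with placeholder names. Note that `nx.in_degree_centrality` returns the in-degree divided by (number of nodes − 1), so it is a fraction rather than a raw count.

```python
import networkx as nx

# Placeholder graph: a, b and d all mention c; c mentions a.
G = nx.DiGraph([("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")])

# Raw counts: how many characters mention each character.
in_degree = dict(G.in_degree())

# Normalised version: in-degree / (n - 1).
in_centrality = nx.in_degree_centrality(G)

# Top 10 (here: all four nodes), highest first.
top10 = sorted(in_centrality.items(), key=lambda kv: kv[1], reverse=True)[:10]
```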

Out-degree centrality

is the reverse of in-degree centrality, referring to how many characters a particular character is directly sending ties to. In our network, this is a way to measure how many characters a particular character knows.

Fig 3. Top 10 Ranking of characters with highest out-degree centrality
  • The average out-degree centrality is 7 — each character mentions about 7 other characters
  • The out-degree values of the most sociable characters do not vary as drastically as the in-degree values of the most popular ones
  • The average in-degree and average out-degree of the network are necessarily the same, since every tie adds one to its sender’s out-degree and one to its receiver’s in-degree, but the distributions of in-degree and out-degree counts differ.
Fig 4. Distribution of characters’ in and out-degree centralities
  • Out-degree shows a distribution closer to a normal distribution
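The relationship between the two averages and the two distributions can be checked on any directed graph. The sketch below uses a made-up placeholder graph. Both averages equal the number of ties divided by the number of characters, while the spread of the counts can still differ.

```python
from collections import Counter

import networkx as nx

# Placeholder graph with five directed ties.
G = nx.DiGraph([("a", "c"), ("b", "c"), ("d", "c"), ("c", "a"), ("a", "b")])
n = G.number_of_nodes()

in_deg = dict(G.in_degree())    # how many characters mention each node
out_deg = dict(G.out_degree())  # how many characters each node mentions

# Each tie is counted once as an out-degree and once as an in-degree,
# so both averages equal (number of ties) / (number of nodes).
avg_in = sum(in_deg.values()) / n
avg_out = sum(out_deg.values()) / n

# Distributions: how many nodes have each degree value.
in_hist = Counter(in_deg.values())
out_hist = Counter(out_deg.values())
```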

Page Rank Centrality

adds an additional consideration for influence: what matters more than the number of characters mentioning a character (in-degree) is how many influential ties are incoming. A higher page rank centrality value suggests that a character is mentioned by more influential characters and has a wider reach beyond their direct ties.

Fig 5. Top 10 Ranking of characters with highest page rank centrality
  • Most characters on this list were previously listed as having some of the highest in-degree centrality values, so they are not only popular but are also mentioned by other well-known characters
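NetworkX offers `nx.pagerank` for this measure (depending on the version, it may need SciPy installed). To make the idea concrete, here is a simplified power-iteration sketch written out by hand. The function name `pagerank_sketch` and the placeholder graph are made up for illustration, and this is not necessarily how the rankings in this article were computed.

```python
import networkx as nx

def pagerank_sketch(G, damping=0.85, iterations=100):
    """Simplified PageRank by repeated redistribution of rank along ties."""
    nodes = list(G)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing ties is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if G.out_degree(v) == 0)
        new_rank = {}
        for v in nodes:
            # Each node u passes its rank on, split equally over everyone u mentions.
            incoming = sum(rank[u] / G.out_degree(u) for u in G.predecessors(v))
            new_rank[v] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank

# Placeholder graph: "star" and "other" each have exactly one incoming tie,
# but star is mentioned by the well-mentioned "hub", other by a "loner".
G = nx.DiGraph([
    ("fan1", "hub"), ("fan2", "hub"), ("fan3", "hub"),
    ("hub", "star"),
    ("loner", "other"),
])
rank = pagerank_sketch(G)
```

Both `star` and `other` have an in-degree of 1, yet `star` ends up with the higher score because its single incoming tie comes from a more influential node.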

3. The Genshin Social Network — Undirected

For further analysis of the network, it becomes necessary to convert the directed graph to an undirected one. A major consideration in this process is the treatment of one-way (asymmetrical) ties. An undirected tie implies that the tie is mutual (goes both ways), which in our context means that the 2 characters know each other (enough to mention each other). As such, it is more meaningful to drop all one-way ties in our undirected network and only consider mutual (reciprocated) ties between characters.

Setting up the undirected network
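One way to do this in NetworkX is `DiGraph.to_undirected(reciprocal=True)`, which keeps an undirected tie only where ties exist in both directions. The sketch below uses a tiny sample of ties, not the full dataset, and may differ from the original code.

```python
import networkx as nx

# Tiny illustrative sample, NOT the full dataset.
G = nx.DiGraph([
    ("Kazuha", "Ayaka"),       # one-way
    ("Zhongli", "Venti"),      # one-way
    ("Zhongli", "Tartaglia"),  # mutual ...
    ("Tartaglia", "Zhongli"),  # ... in both directions
])

# reciprocal=True drops every tie that is not returned.
U = G.to_undirected(reciprocal=True)
```

All characters are still present as nodes in `U`; those whose ties were all one-way simply end up without any ties.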

Fig 6. Genshin Undirected Social Network

Network Observations

  • Requiring ties to be 2-way caused the graph to have isolates, which are characters not connected to any other character (poor traveler)
  • This network looks sparser in ties compared to the directed one

Isolates are unlikely to be central to this network (mainly because they are not part of it), so they will be removed moving forward.
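A minimal sketch of that clean-up step, on a made-up three-node graph: `nx.isolates` yields the nodes with no ties, and it is wrapped in `list` so the graph is not modified while it is being iterated over.

```python
import networkx as nx

# Placeholder undirected graph with one mutual tie and one isolate.
U = nx.Graph()
U.add_edge("Zhongli", "Tartaglia")
U.add_node("Traveler")  # no ties at all

# Collect the isolates first, then remove them.
isolates = list(nx.isolates(U))
U.remove_nodes_from(isolates)
```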

Fig 7. Genshin Undirected Social Network without Isolates

Character Importance in the Network

Having the network be undirected allows the usage of centrality measures beyond degree centrality: closeness, betweenness and eigenvector centralities.

Degree Centrality
places importance on how many characters a character is directly connected to. This is a simplistic way to gauge a character’s importance in the network (a character is important if they know a lot of other characters), but it is still a good basis for comparison with the other measures of importance explained later on.

Fig 8. Top 10 Ranking of characters with highest degree centrality
  • Diluc, who previously ranked highly for in-degree centrality, is no longer on the list once mutual ties are enforced, suggesting that although he is mentioned by many characters, he doesn’t seem to know them (quite in-character, oddly…)
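The pattern described for Diluc can be reproduced on a small made-up graph. In this sketch the placeholder node `quiet` is mentioned by everyone but mentions nobody, so it tops the in-degree ranking and then drops to zero once only mutual ties are kept.

```python
import networkx as nx

# Placeholder graph: a, b, c all mention "quiet", who mentions no one.
# a <-> b and b <-> c are mutual ties.
G = nx.DiGraph([
    ("a", "quiet"), ("b", "quiet"), ("c", "quiet"),
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "b"),
])

in_centrality = nx.in_degree_centrality(G)

# Keep mutual ties only, then measure plain degree centrality.
U = G.to_undirected(reciprocal=True)
degree_centrality = nx.degree_centrality(U)

most_mentioned = max(in_centrality, key=in_centrality.get)
most_connected = max(degree_centrality, key=degree_centrality.get)
```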

Closeness Centrality
places importance on how close a character is to every other character, including those they are only indirectly connected to, by measuring the average length of the shortest paths from that character to all others. A more important character here is one that can reach the other characters in the fewest steps, and so is able to spread information through the network the fastest.

Fig 9. Top 10 Ranking of characters with highest closeness centrality
  • The majority of characters on this list do not have the highest degree centrality (they do not know the most characters) but are able to reach the most other characters the fastest
  • Looking back at the undirected network (fig 7), the characters with higher closeness centrality values are those that form the ties between Mondstadt and Liyue (Diona, Xiangling, Albedo, Xingqiu, Eula, Yanfei). This explains their high importance: these characters have the connections to spread information to a wider audience beyond their own nation
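To see why a character sitting between two clusters can score highly here without knowing many characters, consider this sketch. The graph is made up: two triangles stand in for two nations, joined through a single placeholder node called `bridge`.

```python
import networkx as nx

# Two tight clusters (m1-m3 and l1-l3) joined only through "bridge".
U = nx.Graph([
    ("m1", "m2"), ("m1", "m3"), ("m2", "m3"),
    ("m3", "bridge"), ("bridge", "l1"),
    ("l1", "l2"), ("l1", "l3"), ("l2", "l3"),
])

# For a connected graph: closeness(u) = (n - 1) / sum of shortest-path
# distances from u to every other node.
closeness = nx.closeness_centrality(U)

closest = max(closeness, key=closeness.get)
```

`bridge` has only two ties, fewer than `m3` or `l1`, but it is at most two steps away from everyone, which gives it the highest closeness.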

Betweenness Centrality
puts importance on information brokers in the network, the characters who are “in-between” the pairs of characters that want to reach each other. It is measured by how often a character lies on the shortest path between two other characters. A higher betweenness gives a character more control over the flow of information, which is what makes them important.

Fig 9. Top 10 Ranking of characters with highest betweenness centrality
  • Note how compared to closeness centrality, betweenness centrality produces a different ranking list of characters
  • Once again, characters with higher betweenness centrality values are those that connect Mondstadt to Liyue — Diona, Xiangling, Albedo, Xingqiu, Eula, Yanfei, indicating that they are important brokers of information between the 2 nations
  • In the undirected graph (fig 7), Hu Tao is the only tie Qiqi has to the rest of the network, implying that Qiqi depends on Hu Tao to gain information from or spread information to the rest of the network
  • The same can be observed for Tartaglia and Zhongli: Tartaglia depends on Zhongli to access and distribute information in the wider network
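The dependency described in the last two points can be illustrated on a small made-up graph. Here the placeholder node `pendant` has a single tie, to `anchor`, so every shortest path from `pendant` to anyone else has to pass through `anchor`.

```python
import networkx as nx

# Placeholder graph: a triangle (a, b, c), with c tied to "anchor",
# and "pendant" tied only to "anchor".
U = nx.Graph([
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "anchor"),
    ("anchor", "pendant"),
])

# Share of shortest paths between other pairs that pass through each node.
betweenness = nx.betweenness_centrality(U)

# Nodes whose removal would split the network into disconnected pieces.
cut_points = set(nx.articulation_points(U))
```

`pendant` brokers nothing (betweenness 0), while `anchor` and `c` sit on every route between the triangle and `pendant`.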

Eigenvector Centrality
defines importance as being connected to other influential characters in the network — being connected to well-connected characters. Higher importance is placed on ties to more well-connected characters rather than simply counting each tie as equal (which degree centrality does).

Fig 10. Top 10 Ranking of characters with highest eigenvector centrality
  • This ranking looks really similar to the degree centrality one above, except that this list only contains characters from Mondstadt. This implies that these characters do not just have a lot of ties; their ties also tend to be of higher quality (to characters with a higher level of influence in the network)
  • Similarly, characters who rank high in degree centrality but fail to rank high in eigenvector centrality (e.g. Ningguang) have a lot of ties but these ties are to characters with some distance from centres of power
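The difference between counting ties and weighing them can be sketched on a made-up graph. The placeholder nodes `insider` and `outsider` both have exactly two ties, but `insider`’s ties go into a tightly knit group while one of `outsider`’s ties goes to a peripheral node.

```python
import networkx as nx

# k1..k4 form a fully connected core.
core = [("k1", "k2"), ("k1", "k3"), ("k1", "k4"),
        ("k2", "k3"), ("k2", "k4"), ("k3", "k4")]

U = nx.Graph(core + [
    ("insider", "k1"), ("insider", "k2"),     # two ties into the core
    ("outsider", "k4"), ("outsider", "leaf"), # one core tie, one peripheral tie
])

degree_centrality = nx.degree_centrality(U)
# max_iter is raised from the default only as a precaution for convergence.
eigenvector = nx.eigenvector_centrality(U, max_iter=1000)
```

The two nodes are indistinguishable by degree centrality, yet `insider` scores higher on eigenvector centrality because both of its neighbours are themselves well connected.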

Conclusion: Who is who?

Who is the most popular person in Teyvat?

Jean

  • Highest in-degree centrality (directed network)
  • Highest degree centrality (undirected network)
  • Difficult to deny that a lot of characters know Jean, enough to talk about her
  • In the storyline as well, she is one of the first characters the player would hear about before meeting her

Who is the most important character in Teyvat?

Kaeya

  • This is less easy to answer: what is considered important might be subjective and requires a more holistic look at each node’s role in the social network
Fig 11. Top 10 average ranking of characters
  • A simple way would be to average each character’s rank across the various undirected network centralities, which points to Jean being the most important
  • However, Kaeya has a more advantageous place in the social network: he is in the top 10 rankings for all the applied character importance measures, so he is not just popular but also an important character who can control the flow of information in the social network
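Both approaches (averaging ranks, and checking who appears in every top list) can be sketched as follows. The graph and the helper `rank_desc` are made up for illustration. Ties share a rank here, which is one of several reasonable choices and may differ from how the figure above was produced.

```python
import networkx as nx

def rank_desc(scores):
    """Rank 1 = highest score; nodes with equal scores share a rank."""
    return {
        node: 1 + sum(other > score for other in scores.values())
        for node, score in scores.items()
    }

# Placeholder graph: "hub" is tied to everyone, a and b are also tied to each other.
U = nx.Graph([("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")])

measures = {
    "degree": nx.degree_centrality(U),
    "closeness": nx.closeness_centrality(U),
    "betweenness": nx.betweenness_centrality(U),
    "eigenvector": nx.eigenvector_centrality(U, max_iter=1000),
}

# Average each node's rank across the four measures (lower is better).
ranks = {name: rank_desc(scores) for name, scores in measures.items()}
avg_rank = {node: sum(r[node] for r in ranks.values()) / len(ranks) for node in U}
by_avg_rank = sorted(avg_rank, key=avg_rank.get)

# Nodes that appear in the top k of every single measure.
k = 2
top_k_sets = [
    set(sorted(scores, key=scores.get, reverse=True)[:k])
    for scores in measures.values()
]
in_every_top_k = set.intersection(*top_k_sets)
```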

TL;DR

In Teyvat, you should know Jean, but if you need information, look for Kaeya.
