Social media has completely revolutionized the information landscape. We are more connected to each other than we have ever been in human history. News stories can reach us in an instant and ideas spread across the globe in days, but how does this work? How does information spread, and can we model it? In this article, we will cover the theory behind information spread and use networks to model it.
In this article we will:
- Learn the basics of graph and network theory
- Overview information diffusion and social contagion
- Build a computational framework to simulate the spread of ideas
Graphs and Networks
What are Graphs and Networks?
A graph is a mathematical structure that shows the relation between objects. It does this by representing each object as a vertex that is connected to other vertices with edges that show the relationship between them.
There are many types of graphs. In a weighted graph, every edge is given a value, so it shows not just that a relation exists but how strong it is. In a directed graph, every edge has a direction, so it can show one-way relationships. This variety makes graphs a powerful tool for describing relationships between entities, as we will see below.
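To make these definitions concrete, here is a minimal sketch that builds a small weighted, directed graph with NetworkX (the library used later in this article). The names and weights are invented purely for illustration.

```python
import networkx as nx

# A directed graph: an edge (u, v) points from u to v.
G = nx.DiGraph()

# Hypothetical people and contact strengths, chosen only for illustration.
G.add_edge("Ana", "Ben", weight=0.9)  # Ana talks to Ben often
G.add_edge("Ben", "Cal", weight=0.2)  # Ben rarely talks to Cal
G.add_edge("Cal", "Ana", weight=0.5)

# Direction matters: Ana -> Ben exists, Ben -> Ana does not.
has_forward = G.has_edge("Ana", "Ben")
has_backward = G.has_edge("Ben", "Ana")

# The weight is stored as an attribute on the edge.
strength = G["Ana"]["Ben"]["weight"]

print(has_forward, has_backward, strength)
```

Storing the strength as an edge attribute is the same pattern the simulation later relies on when it attaches a value to each connection.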
Conceptually, graphs and networks are identical, and in most cases the terms can be used interchangeably. By convention, though, "graph" refers to the abstract mathematical concept and "network" is used in applied contexts. For instance, we use the terms "computer network" and "social network" even though "computer graph" or "social graph" would convey the same thing. For our purposes, a network is an applied graph, and the only difference in terminology we need is that vertices are called nodes.
Graph Theory
Graph theory is the branch of mathematics that studies graphs (and networks), describing their properties as well as their applications. We will not cover the entire field of graph theory in this article, but we will look at the highlights relevant to the spread of information.
The spread of information can be modeled with a flow graph, or in our case a flow network. A flow network is a type of weighted directed graph that shows the transfer or transport of something within a structure. This could be water through pipes, data through a computer network, or, in our case, information in a social network.
A flow network has all the components of any network (nodes, edges, weights), but now they represent the flow of something through a given structure. In our case, nodes are individuals in a social network who are connected to other individuals by edges that are weighted by the degree of contact they have. Flow networks can also be built with a source, the origin of the information; a sink, where the flow of information terminates; and a capacity on each edge, the maximum amount of information that can pass through it.
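As a small illustration of the terms source, sink, and capacity, the sketch below builds a toy flow network and asks NetworkX for the maximum flow between the two. The node names and capacities are hypothetical. The simulation later in this article does not use capacities, so this is only meant to make the vocabulary concrete.

```python
import networkx as nx

# Toy flow network; node names and capacities are made up for illustration.
F = nx.DiGraph()
F.add_edge("source", "a", capacity=3)
F.add_edge("source", "b", capacity=2)
F.add_edge("a", "b", capacity=1)
F.add_edge("a", "sink", capacity=2)
F.add_edge("b", "sink", capacity=3)

# maximum_flow returns the total flow value and a per-edge flow dictionary.
flow_value, flow_dict = nx.maximum_flow(F, "source", "sink")

print(flow_value)  # 5: both edges leaving the source can be fully used
```

Here the bottleneck is the source itself: it can send out at most 3 + 2 = 5 units, and the rest of the network is able to carry all of it to the sink.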
Figure 4 is a social flow network that represents a classroom. Each student (and the teacher) is a node in this network that is connected to others in the classroom with directed edges showing who they have contact with. In this case, the edges are not weighted but the nodes are sized by their centrality.
There are many measures of graph/network centrality, but in general centrality quantifies the importance of a node within a network. Two examples are degree centrality, which counts the edges connected to a node, and closeness centrality, which measures how short the shortest paths are from a node to all other nodes (the smaller the average distance, the higher the closeness). In figure 4, the nodes are sized by degree centrality, and the teacher (SL) is the most central node in the network, with the highest degree centrality.
With a basic understanding of graphs and networks, we can go over the next piece we need: information diffusion and social contagion.
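Both measures are available in NetworkX. The sketch below uses a hypothetical classroom (not the one in figure 4) with a teacher "T" and four students, and treats contact as undirected to keep the numbers easy to check by hand.

```python
import networkx as nx

# Hypothetical classroom: a teacher "T" in contact with four students,
# plus one student-to-student contact.
C = nx.Graph()
C.add_edges_from([("T", "s1"), ("T", "s2"), ("T", "s3"), ("T", "s4"), ("s1", "s2")])

# Degree centrality: number of edges at a node divided by (n - 1).
degree = nx.degree_centrality(C)

# Closeness centrality: (n - 1) divided by the sum of shortest-path
# distances from the node to every other node.
closeness = nx.closeness_centrality(C)

most_central = max(degree, key=degree.get)
print(most_central, degree["T"], degree["s3"])
```

The teacher touches all four other nodes, so their degree centrality is 4 / 4 = 1.0, while student s3, who only talks to the teacher, gets 1 / 4 = 0.25.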
Information Diffusion and Social Contagion
Information diffusion refers to the spreading of information through a population or system. In a population, information spreads as members interact in various ways: in person, online, in writing, or through speech or recordings. Information diffusion is used to describe the spread of rumors, the effectiveness of an advertising campaign, and any other movement of information through a population.
We can’t talk about information diffusion without also talking about social contagion. Social contagion looks at how information is spread through a population during the diffusion process and at the factors that influence it. It takes into account social connections, media coverage, and cultural norms, among other things, to study how fast (or how contagious) information is, how much of a population it reaches, and how resistant members of a population are to new information. Social contagion can be used to study phenomena like the rapid spread of a viral video, the propagation of conspiracy theories, and the adoption of new trends.
Diffusion and Contagion Models
Information diffusion and social contagion are both well researched topics that have theoretical models developed for them. We will first look at a simple information diffusion model known as the two-step flow model.
The two-step flow model starts with mass media of some kind, such as a news network or a large company, putting out an idea that is then picked up by opinion leaders (highly influential members of a community). Members of the community then adopt the idea from the opinion leaders they are closest to. This two-step process, from media to opinion leader and then from leader to community, assumes most people don’t adopt ideas straight from mass media; they need a source that is closer to them to share the idea before they adopt it.
For example, say a new phone is released and there are advertisements for it everywhere. The general population will likely be jaded by the advertising, or overwhelmed with advertisements from competing phone companies, and the ads will not convince them to purchase the new phone. If they hear about the new phone from a closer source, say a favorite celebrity, influencer, coworker, or family member, that will be much more persuasive. We can see a visualization of this two-step flow process in figure 6.
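The two steps can be sketched in a few lines of plain Python. This is a deliberately simplified, deterministic toy: the community, the leader names, and the assumption that every follower adopts from their leader are all hypothetical and only meant to show the shape of the model.

```python
# Hypothetical community: two opinion leaders, each with their own followers.
followers = {
    "leader_a": ["p1", "p2", "p3"],
    "leader_b": ["p4", "p5"],
}

def two_step_flow(followers, leaders_reached):
    """Return the set of people who hold the idea after both steps."""
    # Step 1: mass media reaches some of the opinion leaders.
    adopted = set(leaders_reached)
    # Step 2: each reached leader passes the idea on to their own followers.
    for leader in leaders_reached:
        adopted.update(followers.get(leader, []))
    return adopted

# The media campaign only convinces leader_a.
adopted = two_step_flow(followers, ["leader_a"])
print(sorted(adopted))
```

Followers of leader_b never see the idea here, because in this model nobody adopts directly from mass media.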
Before diving into the simulation, we will look at a model for social contagion. Social contagion treats the spread of information like a disease (hence "contagion") that spreads through contact with the information. Thus, the basic social contagion model draws from disease modeling and resembles an SIR curve like the one we saw previously when we modeled disease spread.
This curve can be seen in figure 7, which is a line plot of the process we saw in figure 5. First, an idea is held by an individual or small group, who spread it to the people closest to them. Those people then spread it to the people closest to them, pushing the information outward into the population. This iterative process spreads the idea exponentially at first, until it hits a critical mass and goes mainstream, after which it spreads roughly linearly. The growth starts to taper off once the idea has reached nearly everyone and runs out of people to spread to. This is the same behavior we see with diseases that spread through contact, so it makes sense that information spreads in a similar manner.
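The S-shaped behavior described above can be reproduced with a very small difference equation. The sketch below is a simplified SI-style model (informed people never "recover", unlike a full SIR model), and the function name, population size, and contact rate are assumptions chosen for illustration.

```python
def si_curve(population, initial_informed, contact_rate, steps):
    """Number of informed people per time step in a simple SI-style model."""
    informed = [float(initial_informed)]
    for _ in range(steps):
        i = informed[-1]
        # New adopters come from contacts between informed and uninformed people,
        # so growth is fast when both groups are large and stalls when either is small.
        new = contact_rate * i * (population - i) / population
        informed.append(min(population, i + new))
    return informed

curve = si_curve(population=100, initial_informed=1, contact_rate=0.5, steps=40)
print([round(x, 1) for x in curve[::5]])
```

Early on almost everyone is uninformed, so the count grows by roughly a constant factor each step. Near the end few uninformed people remain, so the curve flattens as it approaches the population size.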
We can now combine our understanding of networks and how ideas spread to create a computational framework to simulate the spread of information and ideas.
Simulation
For this simulation we will use a Python class-based approach similar to the one we created in our agent-based disease model. We will also use the NetworkX library to build networks for us. We start by creating a network class and adding some initial methods. We need a method to create an initial source of information in our network, as well as methods to add new nodes (people) and directed edges (connections). The code to do this is given below and is visualized in figure 8.
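import networkx as nx

class Network:
    def create_source(self):
        #Start with a directed graph whose only node (0) is the source of the idea
        G = nx.DiGraph()
        self.G = G
        self.G.add_node(0)
    def add_node(self):
        #New people are labeled with the next unused integer
        index = len(self.G.nodes)
        self.G.add_node(index)
    def add_connection(self,node1,node2):
        #Directed edge: information can flow from node1 to node2
        self.G.add_edge(node1,node2)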
We can now initialize a source of information, add more people to our network, and add connections between the source and the people in the network as well as between people. We need to add more functionality to our network class before we can run our full simulation, though. We need a way to keep track of which people in the network the initial idea has reached, and a mechanism that propagates that idea.
This is done by adding features to our nodes, which is easily handled by the functionality of NetworkX. We give each person or node in our graph the feature "informed" that is a boolean value that tells us if the idea has reached them or not. Spreading an idea through our network will set more and more nodes to having the value 1 (true) for their informed feature.
With the informed feature added to our nodes, we can also create a method to propagate an idea through our network. We do this by going through the list of edges in our network (provided by the NetworkX object) and seeing which nodes are connected to each other. If an informed node is connected to an uninformed node (informed feature value is 0, or false), then it will propagate its idea to the uninformed node, setting its informed feature value to 1.
To make this simulation more realistic, we will make the propagation mechanism probabilistic because not all interactions between people share ideas or the specific idea propagating through the graph. We will do this by weighting the edge values (giving the connections between nodes a strength value) and then rolling a random number against this edge value to see if we propagate the idea between the two nodes or not. For this simulation we will weight the edges by the degree centrality of the node they are coming out of, so more central, well connected nodes in the network will have a higher probability of propagating the idea.
Additionally this random roll against the edge value will be governed by a resistance value. The resistance value is something we see in social contagion theory that determines how resistant a person is to accepting or spreading a new idea. The lower the resistance value of an individual in our network, the more likely they are to roll favorably against the edge value and become informed.
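To see how the edge weight and the resistance interact, it helps to write down the chance that a single roll succeeds. The sketch below assumes the roll is a uniform draw between 0 and the resistance value, and that the idea spreads when the draw falls below the edge weight, which is how the propagation code later in this article does it. The helper name `spread_probability` and the example numbers are hypothetical.

```python
def spread_probability(connection, resistance):
    """Chance that a uniform draw on [0, resistance) lands below `connection`."""
    if resistance <= 0:
        raise ValueError("resistance must be positive")
    # The draw is uniform, so the chance is the fraction of the interval
    # [0, resistance) that lies below the edge weight, capped at 1.
    return min(1.0, connection / resistance)

# Hypothetical edge weight of 0.06 and the resistance of 0.3 used later on.
p_weak = spread_probability(0.06, 0.3)

# Once the edge weight reaches the resistance, every roll succeeds.
p_strong = spread_probability(0.5, 0.3)

print(p_weak, p_strong)
```

Lowering the resistance shrinks the interval the draw comes from, so the same edge weight covers a larger share of it and the idea passes more often.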
Adding the informed feature and the propagation method is shown in the code below as well as visualized in figure 8.
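import networkx as nx
import numpy as np

class Network:
    def create_source(self):
        G = nx.DiGraph()
        self.G = G
        self.G.add_node(0)
        #The source starts out informed
        self.G.nodes[0]["Informed"] = 1
    def add_node(self):
        index = len(self.G.nodes)
        self.G.add_node(index)
        #Everyone else starts out uninformed
        self.G.nodes[index]["Informed"] = 0
    def add_connection(self,node1,node2):
        self.G.add_edge(node1,node2)
        #Weight the edge by the degree centrality of its origin node at the time the edge is added
        self.G[node1][node2]["Connection"] = nx.degree_centrality(self.G)[node1]
    def propogate_information(self, resistance):
        for edge in list(self.G.edges):
            #Roll a random number in [0, resistance); lower resistance makes success more likely
            rand = np.random.uniform(0,resistance)
            #Spread only from an informed origin, and only if the roll beats the edge weight
            if self.G.nodes[edge[0]]["Informed"] == 1 and rand < self.G[edge[0]][edge[1]]["Connection"]:
                self.G.nodes[edge[1]]["Informed"] = 1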
With the methods we have constructed, we can create a social network. Our network starts with a source of information, and then we add 100 nodes (people) to the network. Next we add connections between people by randomly selecting two nodes from the network and creating a directed edge between them. This is done 300 times, meaning on average every person is connected to about 3 other people (slightly fewer in practice, because draws that pick the same node twice are skipped and repeated pairs do not create new edges). Figure 9 shows the result: our initial social network, where the idea is contained at the source.
import matplotlib.pyplot as plt

def get_color(network):
    #Red for uninformed nodes, green for informed nodes
    color_dict = dict({0:"red",1:"green"})
    color = list(dict(network.G.nodes(data="Informed")).values())
    color = [color_dict[i] for i in color]
    return color

network = Network()
network.create_source()
for i in range(0,100):
    network.add_node()
nodes = list(network.G.nodes)
for i in range(0,300):
    #The first connection always starts at the source so the idea has a way out
    if i == 0:
        node1 = nodes[0]
    else:
        node1 = np.random.choice(nodes)
    node2 = np.random.choice(nodes)
    #Skip self-connections
    if node1 != node2:
        network.add_connection(node1,node2)
pos = nx.kamada_kawai_layout(network.G)
color = get_color(network)
plt.figure(figsize =(10,10))
nx.draw(network.G, node_color = color, arrowsize=20, pos = pos)
Now it is time to propagate the idea from our source through the rest of the network. We will simply put our propagation method in a loop and let it spread an idea through our network for a given number of time steps. Here we will propagate the idea for 50 time steps and give everyone a uniform resistance value of 0.3.
An animation of this spread is shown in figure 10 and the final network in figure 11.
informed = []
networks = [network.G.copy()]
for i in range(0,50):
    network.propogate_information(.3)
    informed.append(sum(list(dict(network.G.nodes(data="Informed")).values())))
    networks.append(network.G.copy())
Through our simulation, we can keep track of how the idea spread through our network by saving the number of informed people per time step. This can be seen in figure 12, which very closely resembles the theoretical curve we would expect from social contagion theory in figure 7. We see the classic S-shaped curve, where the idea picks up exponentially, has a linear phase, and then flattens out at the end as our network becomes saturated with the idea. Looking closely at our network, there are a few nodes that have no edges pointing to them, only edges pointing out of them. They can pass information into the network, but nothing can flow back to them. These nodes will never become informed, so our total population will never be fully exposed to the source idea.
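One way to check which people can never be reached is to ask which nodes have no directed path from the source. The sketch below is a minimal example on a tiny hypothetical network. The helper `unreachable_from_source` is not part of the article's code, and it assumes the source is node 0, as in the simulation.

```python
import networkx as nx

def unreachable_from_source(G, source=0):
    """Nodes that no directed path from `source` can reach."""
    # descendants() returns every node reachable from the source along directed edges.
    reachable = nx.descendants(G, source) | {source}
    return set(G.nodes) - reachable

# Tiny hypothetical network: node 3 only has an outgoing edge.
H = nx.DiGraph()
H.add_edges_from([(0, 1), (1, 2), (3, 1)])

never_informed = unreachable_from_source(H)
print(never_informed)  # {3}
```

Running the same helper on the simulated network would put an upper bound on how many people the idea can ever reach, regardless of how many time steps are simulated.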
Conclusion
In this article we learned about graphs and networks as mathematical structures that represent relations between objects. We also learned about the spread of ideas by going over information diffusion and social contagion theory. We then combined our understanding of networks and the spread of ideas to simulate the spread of an idea through a social network.
Full Code
#Import Libraries
import os
import shutil
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
import imageio
#Make helper functions
def makeGif(networks, name):
    #Draw each saved graph to a temporary frame, then stitch the frames into a gif
    #Note: uses the global layout "pos" computed in the simulation section below
    os.makedirs("frames", exist_ok=True)
    counter=0
    images = []
    for i in range(0,len(networks)):
        plt.figure(figsize = (8,8))
        color = get_color(networks[i])
        nx.draw(networks[i], node_color = color, arrowsize=20, pos = pos)
        plt.savefig("frames/" + str(counter)+ ".png")
        images.append(imageio.imread("frames/" + str(counter)+ ".png"))
        counter += 1
        plt.close()
    imageio.mimsave(name, images)
    #Clean up the temporary frames
    shutil.rmtree("frames")
def get_color(graph):
    #Red for uninformed nodes, green for informed nodes
    color_dict = dict({0:"red",1:"green"})
    color = list(dict(graph.nodes(data="Informed")).values())
    color = [color_dict[i] for i in color]
    return color
#Create Network Class
class Network:
    def create_source(self):
        G = nx.DiGraph()
        self.G = G
        self.G.add_node(0)
        self.G.nodes[0]["Informed"] = 1
    def add_node(self):
        index = len(self.G.nodes)
        self.G.add_node(index)
        self.G.nodes[index]["Informed"] = 0
    def add_connection(self,node1,node2):
        self.G.add_edge(node1,node2)
        self.G[node1][node2]["Connection"] = nx.degree_centrality(self.G)[node1]
    def propogate_information(self, resistance):
        for edge in list(self.G.edges):
            rand = np.random.uniform(0,resistance)
            if self.G.nodes[edge[0]]["Informed"] == 1 and rand < self.G[edge[0]][edge[1]]["Connection"]:
                self.G.nodes[edge[1]]["Informed"] = 1
#Run simulation
network = Network()
network.create_source()
for i in range(0,100):
    network.add_node()
nodes = list(network.G.nodes)
for i in range(0,300):
    if i == 0:
        node1 = nodes[0]
    else:
        node1 = np.random.choice(nodes)
    node2 = np.random.choice(nodes)
    if node1 != node2:
        network.add_connection(node1,node2)
pos = nx.kamada_kawai_layout(network.G)
color = get_color(network.G)
#Plot initial network
plt.figure(figsize =(10,10))
nx.draw(network.G, node_color = color, arrowsize=20, pos = pos)
#Propagate idea
informed = []
networks = [network.G.copy()]
for i in range(0,50):
    network.propogate_information(.3)
    informed.append(sum(list(dict(network.G.nodes(data="Informed")).values())))
    networks.append(network.G.copy())
#Plot final network
plt.figure(figsize =(10,10))
color = get_color(network.G)
nx.draw(network.G, node_color = color, arrowsize=20, pos = pos)
#Plot contagion curve
plt.figure()
t = np.arange(0,len(informed),1)
plt.plot(t,informed)
plt.xlabel("Time")
plt.ylabel("Informed Members")
plt.title("Information Contagion Curve")
plt.savefig("contagionCurve.png")
#Save gif
makeGif(networks, "contagion.gif")
References
All images used in this article were created by the author, taken from the author’s own work, or fall under the creative commons license.
Nguyen, Le, "A Graph-Based Approach to Studying the Spread of Radical Online Sentiment" (2023). Thesis. Rochester Institute of Technology. Accessed from https://scholarworks.rit.edu/theses/11453