The Isolated Den of Fox News

Mapping a network of online news outlet citations

Spencer Dean
Towards Data Science


Interactive visualization of citation networks (https://srdean.shinyapps.io/Final/)

“Ukraine says Russia opened fire on its naval vessels, seized them … G.M. to Idle Plants and Cut Thousands of Jobs as Sales Slow … Scientists in China Claim First Gene-Edited Babies …”

These are several headlines that graced the front pages of the internet during the last week of November 2018. Focusing on 16 of the most popular news sites, I collected data from 480 articles published during this one-week period: five pieces from each of six sections (U.S. News, World News, Politics, Opinion, Business, and Entertainment) for each source, or 16 × 6 × 5 = 480 articles.

Fig. 1: Sources & popularity (millions of unique monthly visitors)

For each article, I recorded when links were made to other online news sites. Links could either be hyperlinks or in-text references and citations*. In other words, whenever there was a flow of information from one source to another, I aimed to capture it.

*This edge construction operated under some of the same assumptions as Google’s PageRank, which presumes that websites receiving more links from other websites are likely more important.

Because the dataset can be broken down in several ways, I created an interactive app that allows users to subset the network into certain categories. It is available here: https://srdean.shinyapps.io/Final/.

The complete network below represents the 312 unique outlets that were cited in my database, plus two points representing NewsMax and the Blaze, which were the only original sources not to appear in another source’s citations list. Points are sized according to “in-degree centrality” (i.e. websites that were linked to by more articles appear larger).

Fig. 2: Complete network with the original 16 sources labeled and points sized according to in-degree centrality.
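As a rough illustration of how in-degree sizing works, here is a minimal Python sketch. It is not the code behind the app (which was built in R), and the records below are made-up placeholders, not rows from my database. It also assumes that repeated links to the same outlet within one article count only once, which matches the "linked to by more articles" definition above.

```python
from collections import Counter

# Hypothetical citation records: (article id, citing source, cited outlet).
# These are placeholders for illustration, not data from the study.
citations = [
    ("a1", "CNN", "AP"),
    ("a1", "CNN", "Reuters"),
    ("a2", "NYT", "AP"),
    ("a3", "Fox News", "AP"),
    ("a3", "Fox News", "AP"),  # a repeated link inside the same article
    ("a4", "Vox", "NYT"),
]


def in_degree(records):
    """Count, for each cited outlet, the number of distinct articles linking to it."""
    # Assumption: collapse repeated links from one article to one outlet.
    unique_links = {(article, cited) for article, _source, cited in records}
    return Counter(cited for _article, cited in unique_links)


degrees = in_degree(citations)
for outlet, count in degrees.most_common():
    print(outlet, count)
```

In a plot, each point's size would then be scaled by its count in `degrees`.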

You’ll notice that several of the largest points in the network are unlabelled. The two sources with the most incoming citations were the Associated Press (AP) and Reuters.** Neither of these showed up on my list of the most frequented U.S. news outlets and therefore were not included in the initial analysis.

**123 articles (roughly a quarter of all those in the database) linked to AP and 60 linked to Reuters.

Apart from these two sources, though, the number of incoming citations seems to be correlated with popularity. For example, CNN and the New York Times (NYT) appear relatively large in the network and have the two highest popularity scores. I investigated this relationship and found a statistically significant (p < .01) positive correlation between number of incoming citations and popularity.

Fig. 3: Correlation between popularity (x-axis) and number of incoming citations (y-axis)
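For readers who want to see what such a test involves, below is a minimal Python sketch of a Pearson correlation and its t statistic. It is an illustration under stated assumptions, not my original analysis: the function names are my own, and the two lists are placeholder values rather than the real popularity and citation counts.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    # Undefined when r is exactly +1 or -1 (division by zero).
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Placeholder values for illustration only, not the study's data.
popularity = [10, 20, 30, 40, 50, 60]   # e.g. millions of monthly visitors
incoming = [2, 3, 7, 8, 12, 11]         # e.g. incoming citations

r = pearson_r(popularity, incoming)
t = t_statistic(r, len(popularity))
print(f"r = {r:.3f}, t = {t:.3f}, df = {len(popularity) - 2}")
```

The t statistic is then compared against a t distribution with n − 2 degrees of freedom to obtain the p-value. With 16 sources (14 degrees of freedom), the two-tailed critical value at the .01 level is about 2.98.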

Notably, though, Fox News does not fit this regression line well. Despite being fourth in popularity with 78 million unique monthly visitors, Fox News was cited by only three stories in the 480-story database. This was the fourth-lowest ranking out of the 16 sources, behind even Vox, which had roughly one third the popularity but more than twice as many incoming citations. Perhaps these results would be different if more conservative news outlets were included in the analysis. However, the liberal sources were selected only because of their higher popularity scores.

These results are consistent with a Pew Research Center study, which found that liberals consistently name an array of main news sources (CNN, NYT, NBC, and NPR) while conservatives are more clustered around just one, Fox News. It would therefore appear that there are many popular liberal-leaning news outlets and mainly one popular source that leans right.

The next network I created contained only edges between sources in the original list (fig. 1). Using the cluster_walktrap function from the igraph package for R, I identified and highlighted communities within this network. This function attempts to find densely connected subgraphs within a graph by means of short random walks. The results are shown in fig. 4, with points colored according to political bias*** (rating by Allsides.com).

***Darker red represents more conservative bias, magenta denotes the lone centrist source, USA Today, and darker blue shows heavier liberal bias.

Fig. 4: A network of hyperlinks and in-text citations from and to online news sources. Points are sized according to number of incoming citations and colored according to political leaning (Allsides.com)
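Restricting the network to the original sources amounts to keeping only the edges whose two endpoints are both on the list. A minimal Python sketch of that filtering step follows. The source set is a small hypothetical subset of the 16 outlets, the edges are placeholders, and the function name is my own. The community detection itself was done with igraph in R and is not reproduced here.

```python
# Hypothetical subset of the original sources, for illustration only.
original_sources = {"CNN", "NYT", "Fox News", "Vox"}

# Placeholder directed edges: (citing outlet, cited outlet).
edges = [
    ("CNN", "AP"),
    ("CNN", "NYT"),
    ("Vox", "NYT"),
    ("Fox News", "Reuters"),
    ("NYT", "CNN"),
]


def restrict_to_sources(edge_list, sources):
    """Keep only edges where both the citing and the cited outlet are in `sources`."""
    return [(src, dst) for src, dst in edge_list if src in sources and dst in sources]


internal_edges = restrict_to_sources(edges, original_sources)
print(internal_edges)
```

Edges pointing to outlets outside the list, such as wire services, drop out at this step, which is why AP and Reuters do not appear in this smaller network.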

This network begins to depict the outsider status of Fox News. Of the two communities, Fox clearly occupies the less central one and appears relatively small because it has few incoming links. The larger community is evidently very blue, but insights from this observation are limited because most sources in the network lean liberal to begin with.

In the remainder of my analysis, I focused on how the network might change depending on article category. Fig. 5 shows one of the six networks I constructed, for articles in Politics sections. Communities were again defined using the cluster_walktrap function.

Fig. 5: Network for articles in Politics sections

Community formation for separate sections was very irregular; however, I was able to draw some conclusions from my results. The most common “neighbors” (sources found in the same community) were NYT and the Washington Post, which appeared together in five out of the six section networks. Whenever two sources were neighbors in the majority of networks (at least four out of six), they shared the same side of the political spectrum according to the Allsides rating. Fox News never appeared in a community with Buzzfeed, HuffPost, the Washington Post, NYT, NBC, or The Guardian.
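The neighbor counts described above can be tallied with a few lines of Python. This is a sketch under stated assumptions rather than my original R code: the community memberships below are invented placeholders covering three hypothetical sections, and the function name is my own.

```python
from collections import Counter
from itertools import combinations

# Hypothetical community memberships per section, for illustration only.
section_communities = {
    "Politics": [{"NYT", "Washington Post", "CNN"}, {"Fox News", "NewsMax"}],
    "Business": [{"NYT", "Washington Post"}, {"CNN", "Fox News"}],
    "Opinion": [{"NYT", "CNN"}, {"Washington Post", "Fox News"}],
}


def neighbor_counts(communities_by_section):
    """Count in how many section networks each pair of sources shares a community."""
    counts = Counter()
    for communities in communities_by_section.values():
        for community in communities:
            # Sort so each pair is always stored in the same order.
            for pair in combinations(sorted(community), 2):
                counts[pair] += 1
    return counts


counts = neighbor_counts(section_communities)
for pair, n_sections in counts.most_common():
    print(pair, n_sections)
```

A pair that is never grouped together, like Fox News and NYT in this toy input, simply has a count of zero.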

Within sociology and social psychology, implicit and confirmation biases are well-observed and thoroughly discussed phenomena. The former, implicit bias, describes our natural tendency to group people into categories and form conceptions about trust and mistrust based on “us vs. them.” The latter, confirmation bias, refers to our propensity to seek out information that confirms what we already know or believe.

Large-scale studies have found that these biases leak deeply into our consumption of media, contributing to political polarization. As Pew Research Center puts it, “When it comes to getting news about politics and government, liberals and conservatives inhabit different worlds. There is little overlap in the news sources they turn to and trust.”

Less well-explored, I believe, is the notion of how these biases may show up in the actual sources of journalism. For this research project, I sought to determine the network characteristics of news website citations. Along the way, I aimed to answer the question of whether news sources, like their consumers, “inhabit different worlds” as well as to determine which websites play the biggest role in the distribution of information across the network.

Network analysis for this project led to findings that suggest strong ties exist between certain sources of online journalism that mediate the flow of information. The prevalence of a given outlet in this network may be predicted somewhat accurately simply by the website’s popularity among readers, but a caveat appears to exist regarding the outlet’s political bias. While more research would be needed to support this observation, it seems that sources are more likely to cite and be cited by sites which share their politics.
