Hands-on Tutorials

Take it to Twitter: Social Media Analysis of Members of Congress

Understanding Views of Congress on Social and Policy Issues through Sentiment Analysis in R

Blake Robert Mills

Published in

Towards Data Science

21 min read · Aug 7, 2021

Image by Author, Portraits from Government Printing Office

Twitter and American Politics

Since the Trump administration took office, Twitter usage has been discussed ad nauseam in American politics. While Former President Trump has since left office, the importance of the social media app has remained constant. Twitter has now become a primary way for congressional members to engage with their constituents and express their opinions on policy issues.

With the new Biden Administration taking office, congressional bipartisanship has been at the forefront of every major issue in politics. With tight margins in both the House and the Senate, the need for members of both parties to support legislation is crucial for effective government. While both sides talk about the need for bipartisan support, in a time of hyper-polarization, do they reflect this need in their social media?

To examine this question, I collected tweets from sitting members of congress. Every account owned and apparently operated by a congress member was scraped for tweets. While many politicians have teams running their social media, the specific goal of this analysis was to look at tweets from accounts where the source appeared to be the congress member themselves.

For this specific project, the main areas of interest were:

How are members of congress tweeting about each other (both members of their own party and the opposing party)?
Are they tweeting about social and policy issues in a positive or negative way?
What keywords are most often used for social and policy issues? And how do these keywords differ by party?

Tweet Collection and Pre-Processing

To conduct this analysis, the Twitter accounts of members of Congress first had to be identified. The Triage Cancer organization has a spreadsheet on their website with the main handles of all congressional members (link to sheet here). While this served as a solid base, I noticed that some members on the list were no longer serving. Another limitation was that the list only named the governmental accounts of congressional members; however, many have other accounts, which are often used more frequently. Thus, I manually looked each member up on Twitter to confirm the congressional account was accurate and to discover any other profiles owned by current congressional members. I sorted these additional accounts into three categories: personal, campaign, and press office accounts. Personal accounts were usually stated as such in their bio and were often more active than a congress member’s congressional account. Campaign accounts were collected if and only if the tweets presented as though they were originating from the congress member. Any account with a disclosure that it was run by a congress member’s team was excluded. Finally, press office accounts were collected under a similar condition: if the bio of the account stated that tweets did not originate from the congress member themself, the account was not included. The purpose of collecting accounts this way was to analyze only the tweets that originate from the congress member.

Of all 535 members of congress, 532 were found to have at least one Twitter account. Former Rep. Alcee Hastings (D-FL), who passed away in April of 2021, did not have a Twitter account. His seat is currently vacant as well; thus, no tweets from Florida’s 23rd District could be analyzed. Rep. Chris Smith (R-NJ) was also found to have no Twitter accounts. The last member not analyzed was Rep. Jeff Van Drew (R-NJ). Rep. Van Drew possessed a Twitter account; however, it was deleted in February of 2021. It did not feel appropriate to include whatever tweets could be collected, as the timespan of his tweets would not be the same as all other accounts.

There were also two other accounts in which tweets could not be collected. Rep. Doug LaMalfa (R-CA) and Rep. Zoe Lofgren (D-CA) had private personal accounts not included in this analysis. Given that these are not accessible to a general audience, it did not make sense to include them, as the purpose of this analysis was to see how congressional members present themselves publicly.

In the end, 1,016 accounts linked to congressional members were identified. This included 517 governmental accounts, 320 personal accounts, 166 campaign accounts, and 17 press office accounts.

Tweets were collected for the period from November 3rd, 2020, to July 25th, 2021. The first collection of tweets was pulled on April 25th, and additional tweets were added on June 15th and July 25th. Because additional tweets were appended to an existing data frame, it is possible that tweets deleted after they were first collected remain in the data frame. Tweets were collected through the Twitter API using the get_timeline function from the rtweet package in R. Due to restrictions of the Twitter API, only the most recent 3,200 tweets from any account could be pulled. No congress member had tweeted more than 3,200 times since November 3rd, 2020, when the initial collection was executed. When additional tweets were appended to the previous collection, duplicates were identified based on the tweet’s content, the screen name of the tweeter, and the time at which it was created. This allowed new tweets to be kept and any duplicates to be removed. Example code for collecting tweets and checking for duplicates is provided below, where “TwitterAccounts” is a data frame with congressional accounts and “Handle” is the column with each Twitter screen name.

library(rtweet)
library(dplyr)

Tweets <- data.frame()
for (i in TwitterAccounts$Handle) {
  df <- get_timeline(i, n = 3200) # Collects the most recent 3,200 tweets from each handle
  Tweets <- rbind(Tweets, df) # Binds into one large df
}

# Used when appending additional tweets to an existing df
NewTwitterDf <- rbind(OldTwitterDf, Tweets)
NewTwitterDf <- NewTwitterDf %>%
  mutate(screen_name = tolower(screen_name)) %>% # Standardizes handles to all be lowercase
  distinct(text, screen_name, created_at, .keep_all = TRUE) # Checks for dups based on tweet, user, and time tweeted

While 1,016 accounts relating to sitting members of congress were identified, only 976 accounts were found to have had tweets between November 3rd, 2020, and July 25th, 2021. Most of the accounts dropped were campaign accounts that were not active after November 3rd. From these accounts, 541,689 tweets in the specified time frame were collected. These served as the tweets used in the analyses.
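The restriction to the collection window and the count of active accounts can be sketched as follows. This is a minimal base R illustration, not the author's code: the toy data frame, the UTC time zone, and the exact boundary timestamps are assumptions, while screen_name and created_at are the columns used in the de-duplication step above.

```r
# Toy stand-in for the combined tweet data frame (hypothetical handles and times)
tweets <- data.frame(
  screen_name = c("rep_a", "rep_a", "rep_b", "rep_c"),
  created_at  = as.POSIXct(
    c("2020-10-30 12:00:00", "2020-11-05 09:00:00",
      "2021-03-01 18:30:00", "2021-08-01 08:00:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Window boundaries; the time zone and end-of-day cutoff are assumptions
window_start <- as.POSIXct("2020-11-03 00:00:00", tz = "UTC")
window_end   <- as.POSIXct("2021-07-25 23:59:59", tz = "UTC")

# Keep only tweets created inside the window
in_window <- tweets[tweets$created_at >= window_start &
                      tweets$created_at <= window_end, ]

n_tweets   <- nrow(in_window)                      # tweets kept for analysis
n_accounts <- length(unique(in_window$screen_name)) # accounts with at least one tweet kept
```

Accounts whose only tweets fall outside the window (rep_c in the toy data) drop out of the account count, which is how inactive campaign accounts would disappear from the total.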

Twitter Profiles and Behavior of Congressional Members

Before any text was processed, simple measures of the following and engagement of congressional members were examined. In the figure below, the follower count (as of July 25th, 2021) of the 976 accounts is shown, broken out by state. Of all accounts, only 35 have garnered more than a million followers; the most popular accounts are identified by their handles in Figure 1 below.

Figure 1. Total Followers of Congress Members by Accounts. Image by Author

A majority of the highly followed accounts (more than 1,000,000 followers) are owned by Democrats (20 of 35). This is reflective of overall trends as well. The mean number of followers for a Republican congressional member’s account is 106,521 with a median of 15,024, while the mean number of followers for Democrats or Independents (grouped together, as the two Independent senators caucus with the Democrats) is 253,327 with a median of 31,570. However, if the follower counts of Independents (mostly driven by Sen. Bernie Sanders) are dropped from the Democrats’ counts, the mean falls to 199,693 followers with a median of 31,496. Regardless, the overall trend appears to be that Democrats are more followed than Republicans.

It is also important to measure engagement with congressional members’ tweets. This was calculated using the mean number of favorites and retweets each account gets per tweet. Only original tweets were used for these calculations; retweets were excluded so that each member’s own engagement was measured accurately. Once again, it appears Democrats have a slight edge over Republicans, with Democrats receiving a mean of 1,171 favorites per tweet and Republicans receiving 627 favorites per tweet. These figures are likely inflated by certain members receiving lots of engagement, as the median number of favorites is 94 per tweet for Democrats and 58 per tweet for Republicans.

Figure 2. Engagement of Original Tweets of Congress Members by Account. Image by Author, Portraits from Government Printing Office

That said, most accounts appear to receive retweets in proportion to their number of favorites, with the vast majority of accounts receiving fewer than 500 favorites per tweet. Notably, of all tweets from congressional members since November 3, 2020, only three have received more than one million likes, all of them from Congresswoman Alexandria Ocasio-Cortez’s personal account (@AOC).
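The engagement figures discussed in this section can be approximated with a short base R sketch. It assumes the rtweet-style columns is_retweet, favorite_count, and retweet_count plus a hypothetical party column joined from the accounts sheet, and it assumes engagement is averaged per account first and then summarised by party; the article does not spell out the exact order of aggregation, and all numbers below are made up.

```r
# Toy data: one row per tweet (hypothetical handles and counts)
tweets <- data.frame(
  screen_name    = c("dem_1", "dem_1", "dem_2", "dem_3",
                     "rep_1", "rep_1", "rep_2", "rep_3"),
  party          = c("D", "D", "D", "D", "R", "R", "R", "R"),
  is_retweet     = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
  favorite_count = c(100, 0, 20, 1080, 50, 10, 6, 24),
  retweet_count  = c(30, 500, 4, 300, 10, 2, 1, 5),
  stringsAsFactors = FALSE
)

# Keep original tweets only, so retweeted content does not count toward engagement
originals <- tweets[!tweets$is_retweet, ]

# Mean favorites and retweets per tweet for each account
per_account <- aggregate(
  cbind(favorite_count, retweet_count) ~ screen_name + party,
  data = originals, FUN = mean
)

# Party-level mean and median of the per-account averages
party_mean   <- aggregate(favorite_count ~ party, data = per_account, FUN = mean)
party_median <- aggregate(favorite_count ~ party, data = per_account, FUN = median)
```

In the toy data a single high-engagement account (dem_3) pulls the Democratic mean well above the median, which is the same skew the paragraph above describes.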

Text Processing

Next, the content of the tweets from these accounts was analyzed. Pre-processing was conducted on all tweets to prepare them for analysis. First, all emojis were converted to text. This was done using the “emojis” data frame (sourced from Unicode.org) in the rtweet package. The data frame contains two columns, one with each emoji and one with the text equivalent of the emoji. To make things interpretable when the tweets were tokenized, spaces in the emoji description were replaced with hyphens, so each description was read as a single token, and the word “emoji” was appended to each description. For example, “😀” was converted to “grinning-face-emoji.” After this, a simple loop checked tweets for each emoji and replaced it with its text equivalent. Example code is provided below.

library(rtweet)
library(stringr)
Emojis <- emojis
Emojis$description <- paste0(" ", str_replace_all(Emojis$description, " ", "-"), "-emoji ") # e.g. " grinning-face-emoji "
for (i in 2284:2623) {
  Election$text <- str_replace_all(Election$text, as.character(Emojis[i, 1]), as.character(Emojis[i, 2])) # Replaces each emoji with its text description
}

Next, hashtags were cleaned. To do this, the “#” character was replaced with the word “hashtag” to help identify which strings were part of hashtags. Following this, all punctuation and capitalization were removed from the document. All numeric values were also removed from every tweet; however, before this occurred, the numbers “45” and “46” were converted to “fortyfive” and “fortysix” so they would not be removed. These were chosen because they usually carry an important semantic value in American politics: 45th President Trump and 46th President Biden.
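The hashtag and number handling described above might look like the following base R sketch. The function name clean_tweet is hypothetical, and the lookaround pattern that protects numbers such as “1945” from being rewritten is a variation of my own rather than something the article states.

```r
# Hypothetical helper: marks hashtags, lowercases, protects 45/46, then strips
# punctuation and remaining digits
clean_tweet <- function(x) {
  x <- gsub("#", "hashtag", x, fixed = TRUE)  # "#vote" becomes "hashtagvote"
  x <- tolower(x)
  # Replace 45 / 46 only when they are not part of a longer number (assumption)
  x <- gsub("(?<![0-9])45(?![0-9])", "fortyfive", x, perl = TRUE)
  x <- gsub("(?<![0-9])46(?![0-9])", "fortysix", x, perl = TRUE)
  x <- gsub("[[:punct:]]", "", x)   # remove punctuation
  x <- gsub("[[:digit:]]+", "", x)  # remove all remaining numbers
  x <- gsub("\\s+", " ", x)         # collapse runs of whitespace
  trimws(x)
}

example_1 <- clean_tweet("#Infrastructure week: 46 signed 2 bills in 1945!")
example_2 <- clean_tweet("The 45th President")
```

The key ordering constraint is the one the paragraph names: the “#” marker and the two protected numbers have to be rewritten before punctuation and digits are stripped, otherwise they are lost.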

Before any further processing was done, a copy of the data frame was saved for the sentiment analysis described later. This copy had to be set aside before stemming and stop word removal, as the words in the tweets needed to remain unchanged for the sentiment library to match them.

The document was also stemmed using the tm library in R. This allowed for tense deviations (vote, votes, voted, voting) and plurals (ballot, ballots) to be read as the same. One issue that arose from this is that the words “policy,” “policies,” and “police” were all read as the same token (“polici” after stemming). Given that these words have different meanings, especially in contemporary politics, a distinction must be made. Thus, prior to stemming, all instances of the words “policy” or “policies” were changed to “policygov,” while “police” and other forms of the word were left unchanged to be stemmed. This allowed the difference between the two words to be kept.

After this was complete, stop words (frequently used words that carry little to no semantic meaning individually) were removed. While the tm library in R contains a stop word list, it includes words like “states.” When used as a verb, this word has little significance; however, in the realm of United States politics, this word needs to be kept. To prevent the exclusion of politically important words, a modified list of stop words was used for removal instead.
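A modified stop word list of this kind can be built by subtracting the words to protect from a base list. The sketch below uses a tiny made-up base list so that it runs without extra packages; in practice the base list would come from whichever stop word source is in use, and only “states” is taken from the article as a protected word.

```r
# Hypothetical stand-in for a full stop word list
base_stopwords <- c("the", "a", "of", "and", "to", "states")

# Politically meaningful words that should survive stop word removal
keep_words <- c("states")

# Custom stop word vector: everything in the base list except the protected words
StopWords <- setdiff(base_stopwords, keep_words)

# Hypothetical helper that drops stop words from a single cleaned tweet
remove_stopwords <- function(x, stop_words) {
  words <- strsplit(x, "\\s+")[[1]]
  paste(words[!words %in% stop_words], collapse = " ")
}

cleaned <- remove_stopwords("the united states of america", StopWords)
```

A vector built this way plays the role of the custom StopWords vector passed to removeWords in the cleaning code.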

Finally, all extra whitespace in tweets was removed using the stripWhitespace function in the tm package. The final result was two data frames: one with cleaned tweets, suitable for sentiment analysis, and one that was additionally stemmed and stripped of stop words, used for tokenization in a bag-of-words model. Code for how all of this cleaning was done is provided below.

library(tm)
library(stringr)

Twitter$text <- tolower(Twitter$text) # Removes capitalization
Twitter$text <- removePunctuation(Twitter$text) # Removes punctuation
Twitter$text <- str_replace_all(Twitter$text, "policy|policies", "policygov") # Changes policy (document) to "policygov"
Twitter$text <- str_replace_all(Twitter$text, "45", "fortyfive") # Changes number 45 to words
Twitter$text <- str_replace_all(Twitter$text, "46", "fortysix") # Changes number 46 to words
Twitter$text <- gsub("[[:digit:]]+", "", Twitter$text) # Removes all numbers

TwitterSentiment <- Twitter # Creates a data frame for sentiment analysis

Twitter$text <- stemDocument(Twitter$text) # Stems document
Twitter$text <- removeWords(Twitter$text, StopWords) # Removes all stopwords from a custom vector
Twitter$text <- stripWhitespace(Twitter$text) # Collapses all white space in a tweet

Sentiment Analysis

One area of interest is the sentiment of congressional members’ tweets. Sentiment analysis is the process of taking a piece of text and analyzing the words used to determine emotion or to assign a numeric value that rates its positivity or negativity.

Using the sentiment data frame described above, tweets were assigned a unique ID number so that tokenization could occur and each tweet could still be identified later. Tweets were then split into individual tokens using the unnest_tokens function in the tidytext package. This resulted in a data frame where each row contained a tweet’s ID number and a single word from that tweet.

Next, the afinn sentiment library was loaded through the tidytext package as well. This creates a data frame with two columns: one with sentimentally charged words and one with the corresponding sentiment. The scale of sentiments ranges from -5 (very negative) to +5 (very positive). This library was then joined to each token. If a word was included in a tweet but not in the afinn library, its sentiment was labeled as NA. Following this, each tweet was aggregated so that the average sentiment of all scored words in the tweet was calculated. If all words in a tweet were NA, the sentiment of the tweet was coded as 0 (neutral sentiment). An example of this process using a tweet by Rep. Val Demings (D-FL), as well as the code for this process, is presented below.

Figure 3. Example of Extracting Sentiment of a Tweet. Image by Author

library(tidytext)
library(dplyr)

TwitterSentiment <- TwitterSentiment %>% mutate(TweetID = 1:nrow(TwitterSentiment)) # Unique ID for each tweet
afinn <- TwitterSentiment %>% unnest_tokens(word, text) %>% # Extracts each word in a tweet to its own row
  left_join(get_sentiments("afinn"), by = "word") %>% # Attaches sentiment from afinn library to each word
  select(-word) # Removes word column
afinnTweets <- aggregate(value ~ TweetID, data = afinn, FUN = mean) %>% # Averages sentiment over scored words in a tweet (NA rows are dropped by aggregate)
  left_join(TwitterSentiment, ., by = "TweetID") %>% # Adds sentiment to original df with complete tweet and data
  mutate(afinn = ifelse(is.na(value), 0, value)) # Changes tweets with NA sentiment to 0

As seen in the code, the NAs were filled with 0 when all words in a tweet were NA. Words that were not sentimentally charged in the afinn library were not replaced with a sentiment of 0, as this would have neutralized long tweets. For instance, in the tweet from Rep. Val Demings above, if NAs were filled with zeros prior to aggregating the mean, the tweet would have a sentiment of +0.45. However, suppose she tweeted, “Congratulations Tampa Bay Buccaneers and thank you to our amazing health care workers!” This tweet would have a sentiment of +0.77, as it has fewer neutral words. Despite the fact that the two phrases are semantically almost identical, they have quite different sentiments. Thus, NAs were not filled with 0s unless all words were NA after joining the afinn library.
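The effect of the two NA-handling choices can be checked with a small worked example. The word scores below are made up for illustration and are not the scores of the tweet in Figure 3; the helper names are hypothetical.

```r
# Approach used in the article: average only the scored words, 0 if nothing is scored
tweet_sentiment <- function(scores) {
  if (all(is.na(scores))) return(0)
  mean(scores, na.rm = TRUE)
}

# Alternative that was rejected: treat every unscored word as 0 before averaging
zero_filled_sentiment <- function(scores) {
  mean(ifelse(is.na(scores), 0, scores))
}

# Two hypothetical tweets with the same charged words (+2 and +4);
# the first simply contains more neutral words
long_tweet  <- c(2, NA, NA, 4, NA, NA)
short_tweet <- c(2, 4, NA)

kept_long   <- tweet_sentiment(long_tweet)         # 3
kept_short  <- tweet_sentiment(short_tweet)        # 3
zero_long   <- zero_filled_sentiment(long_tweet)   # 6 / 6 = 1
zero_short  <- zero_filled_sentiment(short_tweet)  # 6 / 3 = 2
all_neutral <- tweet_sentiment(c(NA, NA, NA))      # 0
```

With NAs dropped, both tweets score the same; with zero-filling, the longer tweet is diluted purely because it has more neutral words.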

Congressional Sentiment

Once the sentiment of all tweets was determined, the average sentiment for every member of congress across all their accounts was calculated. This showed how positively (or negatively) congressional members were presenting themselves publicly. From this analysis, it was found that the average sentiment for all members of congress with Twitter accounts was +0.56, meaning that, as a whole, the average congress member is tweeting slightly positive statements. In fact, of the 532 members with Twitter accounts, only 25 had a negative average sentiment. The average sentiment of all congressional members is presented in the map below, with the members with the highest and lowest sentiment and other frequently mentioned congressional members broken out. Further, the sentiment rank (starting at 1 for most negative and ending with 532 for most positive) is labeled. Additionally, the three members with no Twitter accounts are presented as gray hexagons in their respective states. Each hexagon is marked with the state and either the congressional district they represent or “Sen” for senators. The acronym “MAL” stands for “Member-At-Large” and is used when applicable.

Figure 4. Average Tweet Sentiment by Congressional Member. Image by Author, Portraits from Government Printing Office

One interesting thing to note is that some of the most talked about politicians have the lowest sentiments. Almost all of the frequently talked about congress members fall into the bottom 100 when ranked on sentiment. This could be due to them using their platform to criticize or speak negatively about policy proposals, current events, or other members to promote their agendas to their large audience; however, no formal analysis was done to test this.
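The member-level averages and ranks described in this section could be computed along these lines. This is a base R sketch on toy data; the member column (linking each account to its owner) is an assumed addition to the tweet data frame, and afinn is the tweet-level score.

```r
# Toy tweet-level scores, already pooled across each member's accounts
tweets <- data.frame(
  member = c("A", "A", "B", "B", "C"),
  afinn  = c(1.0, 0.0, -0.5, -1.5, 2.0),
  stringsAsFactors = FALSE
)

# Average tweet sentiment per member
member_sentiment <- aggregate(afinn ~ member, data = tweets, FUN = mean)

# Rank 1 is the most negative member; the highest rank is the most positive
member_sentiment$rank <- rank(member_sentiment$afinn, ties.method = "min")
member_sentiment <- member_sentiment[order(member_sentiment$rank), ]

# Number of members whose average sentiment is below zero
n_negative <- sum(member_sentiment$afinn < 0)
```

Here every tweet carries equal weight regardless of which account it came from; weighting accounts equally instead would be a different, equally defensible choice.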

Interactions through Mentions

Another area of interest is how politicians interact with each other. To measure this, the sentiment of tweets mentioning each politician was gathered and aggregated by the party of the tweet’s author. First, using the mentions_screen_name variable provided in the Twitter dataset, any tweets mentioning the 976 accounts of congress members were pulled. Using the tweet sentiment method discussed above, the average sentiment of tweets mentioning any of a politician’s accounts was determined and broken apart by party. This means that the average sentiment from Democrats and from Republicans could be determined for every member of congress. The results, along with the number of times each member of congress is mentioned by a party, are presented in Figures 5 and 6 below. Once again, for analysis purposes, both Independent senators were treated as Democrats but are presented as Independents in the graphs.
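One way to turn the mentions_screen_name list column into per-member, per-party averages is sketched below in base R. The author_party column, the accounts lookup table, and all values are assumptions made for illustration.

```r
# Toy tweets: author's party, tweet sentiment, and a list column of mentioned handles
tweets <- data.frame(
  author_party = c("D", "D", "R", "R"),
  afinn        = c(1, -1, 2, 0),
  stringsAsFactors = FALSE
)
tweets$mentions_screen_name <- list(c("rep_a", "rep_b"), "rep_a", "rep_a", NA_character_)

# Hypothetical lookup from handle to member (one member may own several handles)
accounts <- data.frame(
  screen_name = c("rep_a", "rep_b"),
  member      = c("Member A", "Member B"),
  stringsAsFactors = FALSE
)

# Expand to one row per (tweet, mentioned handle)
reps <- lengths(tweets$mentions_screen_name)
mentions <- data.frame(
  author_party = rep(tweets$author_party, reps),
  afinn        = rep(tweets$afinn, reps),
  screen_name  = tolower(unlist(tweets$mentions_screen_name)),
  stringsAsFactors = FALSE
)

# Inner join keeps only mentions of congressional accounts (drops NA and outsiders)
mentions <- merge(mentions, accounts, by = "screen_name")

# Average sentiment and mention count for each member, by party of the author
mean_sent <- aggregate(afinn ~ member + author_party, data = mentions, FUN = mean)
n_ment    <- aggregate(afinn ~ member + author_party, data = mentions, FUN = length)
names(n_ment)[names(n_ment) == "afinn"] <- "n_mentions"
by_party  <- merge(mean_sent, n_ment, by = c("member", "author_party"))
```

Keeping the mention count next to the average makes it easy to spot members whose score rests on only one or two tweets.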

Figure 5. Average Sentiment of Members of Congress based on Republican Tweets. Image by Author, Portraits from Government Printing Office

Figure 6. Average Sentiment of Members of Congress based on Democratic Tweets. Image by Author, Portraits from Government Printing Office

Interestingly, the graphs are strikingly similar, with almost all politicians, regardless of party, clustered above 0, meaning that, for the most part, all members are being mentioned in a positive light. Those below 0, for both parties, are almost exclusively members of the other party. However, what is perhaps most interesting is that members of the opposite party are heavily clustered around 0 on the x-axis. This means that both parties are hardly mentioning the other in their tweets. This has two large implications. First, the average sentiment for many members is determined by one or two tweets. Thus, extreme values are exaggerated and should be taken with a grain of salt. Second, in an era where bipartisanship has been at the center of almost every piece of legislation, politicians tend to stick to members of their own party when mentioning colleagues, at least on Twitter. While it is positive to see that, for the most part, politicians are not tweeting about opposing party members negatively, more interactions between parties need to occur in order to foster cooperation and give the appearance of bipartisanship to their constituents and followers.

One important caveat for this and the further sentiment analyses is that some tweets may appear negative according to the afinn library when contextually they are not. This is exemplified in the case of Rep. Grace Meng (D-NY). She is one of the politicians most mentioned in tweets by Democrats (mentioned 440 times) and has a slightly negative sentiment from Democrats as well (sentiment of -0.15). This is not because she is viewed unfavorably, but rather because she is the sponsor of H.R. 1843, the COVID-19 Hate Crimes Act. When mentioning her, many congressional members have also mentioned her bill, and the afinn library attaches a negative sentiment to the word “hate.” While this is an important caveat to note for particular examples, the analysis of tweets as a whole is more reflective of the afinn library’s determined sentiment than exceptions such as these.

Sentiments about Issues

While congressional members may not talk about each other that often, one area they overlap is the discussion of social and policy issues. With sentiments of all tweets determined, how members discuss contemporary issues could be examined.

Four contentious contemporary issues were randomly selected to examine: Black Lives Matter, immigration, infrastructure, and the minimum wage. Throughout the 2020 campaigns and continuing into legislation of the 117th congress, these issues have been discussed frequently, thus producing many tweets to allow for a thorough analysis.

Using only tweets that mentioned these issues, the average sentiment of all congressional members (if they had any tweets about said issue) was calculated. Their sentiments, as well as their frequency of tweeting about each issue are presented in Figures 7, 8, 9, and 10 below.
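Selecting issue tweets and summarising them per member could look like the following base R sketch. The article does not list the search terms that were used, so the keyword patterns here are purely hypothetical, as are the toy tweets and the member column.

```r
# Hypothetical keyword patterns for each issue (not the author's actual terms)
issue_patterns <- c(
  black_lives_matter = "black lives matter|blacklivesmatter|\\bblm\\b",
  immigration        = "immigra",
  infrastructure     = "infrastructure",
  minimum_wage       = "minimum wage|minimumwage"
)

# Toy cleaned tweets with their tweet-level sentiment
tweets <- data.frame(
  member = c("A", "A", "B", "B"),
  text   = c("we must raise the minimum wage",
             "hashtaginfrastructure week is here",
             "raising the minimum wage hurts small business",
             "great day in the district"),
  afinn  = c(0.5, 1, -1, 2),
  stringsAsFactors = FALSE
)

# For each issue: keep matching tweets, then average sentiment and count tweets per member
issue_summary <- do.call(rbind, lapply(names(issue_patterns), function(issue) {
  hits <- tweets[grepl(issue_patterns[[issue]], tweets$text,
                       ignore.case = TRUE, perl = TRUE), ]
  if (nrow(hits) == 0) return(NULL)  # no member tweeted about this issue
  out <- aggregate(afinn ~ member, data = hits, FUN = mean)
  out$n_tweets <- as.vector(table(hits$member)[as.character(out$member)])
  out$issue <- issue
  out
}))
```

Members with no matching tweets simply do not appear for that issue, which mirrors the condition that a member needed at least one tweet about an issue to be included.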

Figure 7. Frequency and Average Sentiment of Tweets Mentioning Black Lives Matter by Congress Member. Image by Author, Portraits from Government Printing Office

Figure 8. Frequency and Average Sentiment of Tweets Mentioning Immigration by Congress Member. Image by Author, Portraits from Government Printing Office

Figure 9. Frequency and Average Sentiment of Tweets Mentioning Infrastructure by Congress Member. Image by Author, Portraits from Government Printing Office

Figure 10. Frequency and Average Sentiment of Tweets Mentioning Minimum Wage by Congress Member. Image by Author, Portraits from Government Printing Office

From the graphs, it is evident that for almost every issue, there is a clear divide between the two parties in terms of sentiment. Even in the case of infrastructure, where nearly all tweets are slightly positive in sentiment, Democrats still appear more positive as a whole than Republicans. In terms of frequency of tweets, both parties show similar trends for the issues of immigration and infrastructure, with the majority of each party mentioning the issue a few times and a few members tweeting about it frequently. For the issue of minimum wage, almost all of the top tweeters are Democrats.

As for Black Lives Matter, the graph is overwhelmingly filled with blue dots compared to red ones. This signifies that, as a whole, more Democrats than Republicans have mentioned Black Lives Matter in at least one tweet. Interestingly, the most frequent tweeter on the issue is Rep. Marjorie Taylor Greene (R-GA). In fact, she makes up over 25% of all Republican tweets about Black Lives Matter. Given her outspokenness on this matter, this comes as little surprise; however, this is the only issue of the four selected where the top tweeter has a negative sentiment. It should be noted that tweets about Black Lives Matter naturally skew negative, as words describing the events surrounding it (such as murder) carry negative scores in the afinn library. Thus, negative sentiment may not reflect a politician’s view of the issue but rather what events they are choosing to focus on.

While there appears to be a difference between the parties on these issues, the difference should be tested formally. Thus, a two-tailed Student’s t-test was performed on the difference between each party’s mean sentiment for each of the four issues. Additionally, to check the effect size of these differences, Cohen’s d was calculated. The results from all these tests are presented in the table below.
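For a single issue, the test and effect size named above can be sketched in base R as follows. The sentiment values are made up, and whether the unit of analysis is the tweet or the member is an assumption here, since the article does not say. The pooled-standard-deviation form of Cohen's d is used because it pairs naturally with the equal-variance Student's t-test.

```r
# Hypothetical sentiment scores for one issue, split by party
dem_sent <- c(0.8, 0.5, 1.2, 0.3, 0.9, 0.6)
rep_sent <- c(-0.2, 0.1, -0.5, 0.4, -0.1, 0.0)

# Two-tailed Student's t-test (var.equal = TRUE gives Student's rather than Welch's test)
t_result <- t.test(dem_sent, rep_sent, var.equal = TRUE, alternative = "two.sided")

# Cohen's d: difference in means divided by the pooled standard deviation
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

d <- cohens_d(dem_sent, rep_sent)
p_value <- t_result$p.value
```

By Cohen's conventional benchmarks, values of d around 0.2, 0.5, and 0.8 are read as small, medium, and large effects, which is the scale used to interpret the table.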

Table 1. Results of Statistical Analysis Between Party Tweet Sentiment Difference of Issues. Image by Author

There was a highly statistically significant difference between the mean sentiment of tweets for Democrats and the mean sentiment of tweets for Republicans for all four issues. For all the issues except infrastructure, the average sentiment for Democrats was positive and the average sentiment for Republicans was negative. The issues of infrastructure and Black Lives Matter are seen to have a small effect size based on the Cohen’s d estimate. Thus, while there is a difference between the average sentiment of these issues for each party, the size of this effect is relatively small. However, the issues of minimum wage and immigration produced medium and large effect sizes, respectively, according to the estimated Cohen’s d. Thus, based on tweets about them, these policy issues see the largest partisan divide. What is striking is the especially high Cohen’s d estimate for immigration. At 0.93, this large effect size signals that, of all the issues chosen, this is the policy area where Democrats and Republicans have the biggest discrepancy of viewpoints based on the sentiment of their tweets.

Talking about Issues

Given that there is a clear difference between the way Democrats and Republicans tweet about policy issues, it is interesting to see what keywords are most frequently used in relation to each issue. To do this, the stemmed Twitter data frame with stop words removed was used. Once again, any tweet mentioning one of the four selected social or policy issues was separated out. From there, using the unnest_tokens function from the tidytext package, the frequency of all unigrams (one-word tokens) and bigrams (two-word tokens) in a tweet was calculated. These frequencies were then summed for each party to gather the total number of times a unigram or bigram is used by a party in tweets about a specific policy or social issue. Once again, Independent tweets were categorized under Democrats.

After this, the top 25 unigrams or bigrams for each issue by party were found. The following four figures display these most common tokens in tweets by Democrats and Republicans.
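The unigram and bigram counting can be illustrated with a dependency-free base R sketch. It stands in for the tidytext-based tokenization described above rather than reproducing it, and the stemmed toy tweets and helper names are hypothetical.

```r
# Toy stemmed tweets about one issue, with the author's party
tweets <- data.frame(
  party = c("D", "D", "R"),
  text  = c("rais minimum wage", "minimum wage worker", "minimum wage hurt small busi"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all n-word tokens in a single text
ngrams <- function(text, n) {
  words <- strsplit(text, "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts, function(i) paste(words[i:(i + n - 1)], collapse = " "), character(1))
}

# Count unigrams and bigrams together across a set of texts, most frequent first
count_tokens <- function(texts) {
  tokens <- unlist(lapply(texts, function(txt) c(ngrams(txt, 1), ngrams(txt, 2))))
  counts <- table(tokens)
  counts <- setNames(as.integer(counts), names(counts))
  sort(counts, decreasing = TRUE)
}

# Top 25 tokens for each party
top_tokens <- lapply(split(tweets$text, tweets$party),
                     function(texts) head(count_tokens(texts), 25))
```

Each element of top_tokens is a named vector of counts that could feed a word cloud for the corresponding party.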

Figure 11. Most Frequent Words and Phrases in Tweets about Black Lives Matter by Party. Image by Author

Figure 12. Most Frequent Words and Phrases in Tweets about Immigration by Party. Image by Author

Figure 13. Most Frequent Words and Phrases in Tweets about Infrastructure by Party. Image by Author

Figure 14. Most Frequent Words and Phrases in Tweets about Minimum Wage by Party. Image by Author

Not only do these word clouds reveal what each party associates with an issue, but they also help explain the trends in their sentiments. One of the most interesting areas is the difference between the parties on the issue of Black Lives Matter. As seen in the Cohen’s d estimate, the effect size of the difference in sentiment was small. However, this may be because both parties are tweeting about the issue negatively, but in different ways. From the word clouds, it can be seen that “protest” is one of the words frequently used by Democrats, which carries a sentiment of -2 according to the afinn library. Meanwhile, Republicans often use the words “riot” and “violence” in association with Black Lives Matter, which carry sentiments of -2 and -3, respectively. Here, we can see that while both parties are using negative words, the connotations of protest and riot are quite different. It should also be reiterated that Rep. Marjorie Taylor Greene accounted for approximately 27% of all Republican tweets about Black Lives Matter, meaning that this word cloud may be more reflective of her view than of the Republican Party as a whole. Nonetheless, her prominence here signifies not only her focus on the issue but also the notable lack of Republican tweets, especially compared to the number of tweets by Democrats.

One interesting thing to note is that “Biden” is one of the most frequently used words for all issues in Republican tweets; however, “Biden” does not appear on any of the word clouds for Democratic tweets. An interesting area for further analysis would be to determine in what context Biden’s name is being used: whether this is merely because he is the current president (and thus an important talking point), or rather because they are criticizing his approach to these issues.

Key Take-aways

From the sentiment analyses conducted and the tokenization of tweets regarding issues, a few key takeaways can be seen.

There is a clear difference between the ways Democrats and Republicans tweet about issues. For the selected social and policy problems, the t-test and Cohen’s d revealed a clear difference between the sentiment of the two parties. Looking at the word clouds as well, it can be seen that Democrats and Republicans seldom use similar words to describe the selected issues. This signals that instead of using Twitter as a platform to promote a bipartisan agenda, congress members are doubling down on party divides and making this public to their constituents as well.
Republicans tend to tweet more negatively about issues than Democrats, even on issues where there is bipartisan support, such as infrastructure. This phenomenon could occur for several reasons, including the topics selected, the current administration being Democratic, or Twitter behavior in general. Another possibility is that this displays the partisan divide of congress, where even in cases of policy agreement, sentiment and enthusiasm are not equal.
Frequent or passionate tweeters have the power to dominate a party’s sentiment and views on Twitter. As seen with Rep. Marjorie Taylor Greene, one politician can dominate an issue if they tweet about it enough. To a lesser extent, this could also be applied to Rep. Pramila Jayapal (D-WA), who was found to be the most frequent tweeter for 3 of the 4 issues selected.
While Democrats lead in terms of most followed accounts, top Republicans are getting close to the same engagement. Democratic members own 20 of the 35 accounts with over 1,000,000 followers, and Rep. Alexandria Ocasio-Cortez stands out for the average likes and retweets she receives; however, the accounts that received the most engagement are closer to a 50–50 split between Republicans and Democrats.
To make waves, you might have to be negative. As seen in the sentiment map of all members, some of the most talked-about members have some of the most negative sentiments of all congress members. From this analysis, it is unclear whether negativity receives more attention, or if the most well-known members of congress naturally tweet more negatively.

Areas for Further Examination

There are several areas where a further examination is warranted. First, a wider range of issues should be examined. While the four issues were chosen due to their prevalence, with the exception of infrastructure, they are pro-Democrat issues (issues Democrats are pushing for). Issues that Republicans are pushing for should be examined to see how Democrats tweet about those issues.

Additionally, different ways in which the parties discuss issues could present a fascinating insight. For instance, when it comes to gun legislation, seeing how the words “regulation,” “control,” or “2nd Amendment” affect engagements (likes and retweets) and sentiment could reveal insights on how the parties present these issues to their followers.

Finally, using a different sentiment library may be more useful for some policy issues. As seen with Black Lives Matter, words with a negative sentiment are naturally used by both parties; however, discussing “riots” and “protests” shows a different viewpoint on the issue that cannot easily be captured in a semantic value. Using a different library or a different process may be able to capture this difference between parties better.

Thank you for reading. Feel free to email me with any questions at brm2143@columbia.edu or reach out to me on LinkedIn here.

Note: All photos of congressional members are their official portraits and are therefore part of the public domain. Photos were taken from the Government Printing Office’s Member Guide Photos.