
Lukewarm Sentimental Footing for The Building Bridges Initiative (BBI) in Kenya


The Building Bridges Initiative (BBI), a governance framework in Kenya, is the brainchild of Kenya’s President Uhuru Kenyatta and opposition leader Raila Odinga. The idea took off after the March 2018 "Handshake" as a way of dealing with election-related violations and the cyclic bickering between winners and losers, especially in Kenya’s presidential race. The BBI taskforce was formed to look into, and make recommendations on, nine facets that Kenyatta and Odinga had decided were crucial to the effort to "create a united nation for all Kenyans living today, and all future generations". The report is out and can be found here.

However, the report was launched on 26th October 2020, at a time when Kenya was battling COVID-19 alongside a slowdown in economic growth. Kenyans debated the importance of the report intensely, and vented on Twitter for the better part of the report’s existence. I downloaded English tweets related to BBI and ran some analytics on whether the online section of Kenyans (tweeters) supports the initiative or not.

Data Collection

Contextual tweets were collected with Twint using the search parameters below. I collected tweets posted between 2020-10-26, the report launch date, and the last day of collection, 2nd December 2020. For replication, the Twint command below will help collect the same tweets. No logins, APIs or keys are needed to collect them.

twint -s BBI --since 2020-10-26 --lang en -o BBI.csv --csv

A quick recap on installing Twint: it is always good to create an environment first. In my case it was a Pipenv one running Python 3.6, with Twint installed via the command below in Windows PowerShell:

pipenv install git+https://github.com/twintproject/twint.git#egg=twint

Instructions on installing Pipenv can be found here.

A total of 221,470 tweets spanning the collection period were collected, along with varied metadata about each tweet. Of interest to me were the retweet count, the ID and the tweet text itself. The most retweeted tweet (3,095 retweets) was by user muthonidq and read as follows:

This is a reminder for young Africans everywhere. These "liberation heroes" have lived long enough to be the villains. Everywhere we see them cling to power & change laws to suit themselves. In Kenya we are calling the power grab BBI. Don’t allow it.

The code for getting the same is below:-
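A minimal sketch of how the most retweeted tweet could be looked up, using only the standard library. The column names (`id`, `username`, `tweet`, `retweets_count`) are assumed to match Twint's CSV export, and the inline sample rows are made up purely for illustration.

```python
import csv
import io

# Made-up stand-in for BBI.csv; column names are assumed to follow Twint's CSV export.
SAMPLE_CSV = """id,username,tweet,retweets_count
1,user_a,BBI report launched today,12
2,user_b,Reading the BBI report tonight,40
3,user_c,What is BBI about?,7
"""


def most_retweeted(csv_file):
    """Return the row (as a dict) with the highest retweet count."""
    rows = csv.DictReader(csv_file)
    # CSV values are strings, so cast the count before comparing.
    return max(rows, key=lambda row: int(row["retweets_count"]))


top = most_retweeted(io.StringIO(SAMPLE_CSV))
print(top["username"], top["retweets_count"], top["tweet"])
```

Against the real export, the same function can be given an open file handle, for example `open("BBI.csv", newline="", encoding="utf-8")`.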

Quite touching, to say the least. All in all, let’s have a look at the sentiments of the rest of the tweeters.

Pre-processing

A little pre-processing (cleaning) of the tweets was needed for better modelling. There is no need to keep numbers or abbreviations that contribute little to the overall sentiment.
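The cleaning step can be sketched roughly as below. The exact rules used in the notebook are not listed in this article, so the choices here (dropping links, @mentions, digits and punctuation) are assumptions, and `clean_tweet` is a hypothetical helper name.

```python
import re


def clean_tweet(text: str) -> str:
    """Strip links, @mentions, digits and punctuation from a tweet."""
    text = text.replace("’", "'")                        # normalise curly apostrophes
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # drop links
    text = re.sub(r"@\w+", " ", text)                    # drop @mentions
    text = re.sub(r"[^A-Za-z'\s]", " ", text)            # drop digits, '#', punctuation
    return re.sub(r"\s+", " ", text).strip()             # collapse whitespace


cleaned = clean_tweet("@user The BBI report of 2020 is out! https://t.co/abc #BBI")
print(cleaned)
```

With the tweets in a pandas dataframe, the helper could be applied per row, for example `df["tweet"].apply(clean_tweet)`, assuming the text sits in a `tweet` column.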

Sentiment Analysis

Sentiment analysis, as defined here, is the contextual mining of text to identify and extract subjective information from the source material. It helps build a better understanding of, for example, a product based on the feedback received. Text data is the most prevalent in this sphere, which is why tweets were of interest in this experiment, and of course BBI is the product.

A column was added to the dataframe representing the polarity score, i.e. how negative to positive (-1 to 1) each tweet is, using the TextBlob package.

The resulting dataframe after running the above is as below:

Polarity scores below 0 represent negative sentiment, a score of 0 is neutral, and a score above 0 represents positive sentiment. To make this more understandable, I coded the same rules to output another dataframe column using the code below:
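The three rules above translate directly into a small function. This is a sketch, with `label_sentiment` as a hypothetical name rather than the notebook's own.

```python
def label_sentiment(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label."""
    if polarity < 0:
        return "negative"
    if polarity > 0:
        return "positive"
    return "neutral"


# Label a few example scores.
labels = [label_sentiment(p) for p in (-0.4, 0.0, 0.75)]
print(labels)
```

Applied to a dataframe, this could look like `df["sentiment"] = df["polarity"].apply(label_sentiment)`, again assuming those column names.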

A sample of the resulting dataframe is below and largely makes sense:

A pie chart depicting the distribution of sentiments across tweets, counted by tweet ID, is generated using the code below:
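A rough sketch of the counting and plotting, assuming one sentiment label per tweet. The labels here are made up, the function names are hypothetical, and matplotlib is an assumption about the plotting library, which is only needed for the chart itself.

```python
from collections import Counter


def sentiment_shares(labels):
    """Return the percentage share of each sentiment label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def plot_sentiment_pie(shares, path="sentiment_pie.png"):
    """Save a pie chart of the sentiment shares to `path`."""
    # Imported here so the counting helper works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.pie(list(shares.values()), labels=list(shares.keys()), autopct="%1.1f%%")
    ax.set_title("Sentiment distribution of BBI tweets")
    fig.savefig(path)
    plt.close(fig)


# Made-up labels, one per tweet ID.
labels = ["neutral", "neutral", "positive", "negative", "neutral"]
shares = sentiment_shares(labels)
print(shares)
```

Calling `plot_sentiment_pie(shares)` then writes the chart to disk. Because each tweet ID is unique, counting labels is the same as counting tweets per sentiment.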

Basically, a simple count of each sentiment per tweet ID (unique to each tweet) is computed and output as a pie chart below.

From the above, the below conclusions and assumptions can be made:

  1. Each of the tweeters in the dataset is someone affected by the BBI-related conversation, which is why they tweeted about it. There is a high chance they are Kenyans, as the document is specific to Kenyan politics.
  2. Some of the tweets were written in Swahili or a mixture of Swahili and English. TextBlob modelled only the English tokens when scoring sentiment; the few Swahili ones were ignored. I’m not certain that translating them to English would retain their semantics, especially contextually. This can be tested in further experiments.
  3. As it stands, a large section of Kenyans and interested persons are lukewarm about the BBI. At 61.6 percent, the neutral share is huge, and it is a wake-up call to the proponents of the BBI. Either Kenyans don’t have much information on the exercise or, maybe, they are just not ready. 24.4% support it, while 14.4% are outright negative about it. The proponents of the exercise have a huge task ahead to tilt the neutral numbers.

Let’s connect via my LinkedIn page. The Jupyter notebook with these code snippets can be found here. Download your own version of the data as described in this article, and analyse it using the code in the notebook.

