Introduction
In healthcare research, multiple groups or individuals often study the same condition or variable. They collect data independently, but expect, and often need, consistency among the people reporting the findings. Well-designed procedures must therefore systematically measure agreement among the various data collectors. The extent of agreement among data collectors is known as interrater reliability.
Historically, reliability between two raters was measured with simple percent agreement: the number of agreements divided by the total number of scores. This, however, does not account for the possibility that the two raters agreed by chance.
Cohen’s Kappa Coefficient was therefore developed to adjust for this possibility.
Put simply, Cohen’s Kappa is a way to measure reliability between two raters (judges, observers) while correcting for the probability of agreement occurring by chance.
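To make the chance correction concrete, here is a minimal sketch that computes both percent agreement and kappa by hand, using the standard formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function names and the ten ratings are hypothetical and exist only for illustration.

```python
from collections import Counter


def percent_agreement(r1, r2):
    """Share of items on which both raters gave the same label."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohens_kappa(r1, r2):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when both raters use a single identical label.
    """
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: probability that both raters pick the same label
    # if each rated independently according to their own label frequencies.
    p_e = sum(c1[label] * c2[label] for label in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings for ten items
rater_a = ["yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "no", "no"]
rater_b = ["yes", "yes", "yes", "yes", "yes", "yes", "yes", "no", "yes", "no"]

print(percent_agreement(rater_a, rater_b))        # 0.8
print(round(cohens_kappa(rater_a, rater_b), 3))   # 0.375
```

In this made-up example the raters agree on 80% of items, but because both say "yes" most of the time, much of that agreement would be expected by chance, and kappa is considerably lower.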
For more reading on Cohen’s Kappa Coefficient, view the following:
Building a Simple Kappa Statistic App
My goal with this application was to create a simple way to input data and calculate Cohen’s Kappa Coefficient. To do so, I used Streamlit, an open-source framework for rapidly building data science apps in pure Python, and scikit-learn, an open-source library for machine learning and data analysis.
Below is a tutorial on building a simple kappa statistic app with Streamlit and scikit-learn.
Setting the Stage
Let’s begin by importing the libraries we need. Most importantly, we use scikit-learn (sklearn) for our functions and streamlit for our user interface.
import pandas as pd
import streamlit as st
import sklearn
from sklearn import metrics
import os
import numpy as np
from urllib.error import URLError
import matplotlib.pyplot as plt
import base64
Title and Text
Streamlit makes it easy to add text to our app.
To do so, we use streamlit.title and streamlit.text. We add additional text to our sidebar using streamlit.sidebar.
st.title("Kappa Stat Calculator")
st.text("Measuring interrater reliability")
st.sidebar.header("About")
st.sidebar.text("""The kappa stat calculator uses the
power of scikit-learn to quickly
calculate Cohen's kappa statistic
between two raters.
Upload a CSV file with columns
specifying your raters' names or IDs.
""")

File Upload
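Next, we make it possible to upload a file into our app using streamlit.file_uploader. The uploader returns a file-like object (or None until a file has been chosen), so we read it into a DataFrame with pandas.read_csv.
uploaded_file = st.file_uploader("Choose a file")
df = pd.read_csv(uploaded_file) if uploaded_file is not None else pd.DataFrame()  # empty frame until a file is uploaded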
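The upload step can be tried outside of Streamlit, because the object returned by the uploader behaves like an in-memory binary file. The sketch below uses io.BytesIO as a stand-in for an upload; the helper name load_ratings and the CSV contents are assumptions made up for this example.

```python
import io

import pandas as pd


def load_ratings(file_like):
    """Read an uploaded CSV (any file-like object) into a DataFrame.

    Returns None when nothing has been uploaded yet.
    """
    if file_like is None:
        return None
    return pd.read_csv(file_like)


# Stand-in for an upload: a hypothetical CSV with one column per rater
fake_upload = io.BytesIO(b"person1,person2\n1,1\n0,1\n1,1\n0,0\n")
df = load_ratings(fake_upload)

print(df.shape)           # (4, 2)
print(list(df.columns))   # ['person1', 'person2']
```

Returning None for a missing upload lets the calling code decide what to show before the user has picked a file.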

Splitting into Columns
For this next part, we split the screen into two columns to keep the app easy to use and interact with. This can be done with streamlit.columns (called streamlit.beta_columns in older Streamlit releases).
col1, col2 = st.columns(2)
Left Column
The left column displays our DataFrame, using pandas to work with our data. We use streamlit.dataframe and add df.style.highlight_max to highlight the largest value in each column, which makes exploring the data easier and can expose potential discrepancies.
with col1:
    st.dataframe(df.style.highlight_max(axis=0))

Right Column
The right column displays a graph for exploring the data at a glance. We can view a line chart with streamlit.line_chart.
with col2:
    st.line_chart(df)

Input Fields
Next, we create text input fields to specify columns in our dataframe. In this case, our data has two raters named person1 and person2.
person1 = st.sidebar.text_input("Enter column name for person 1")
person2 = st.sidebar.text_input("Enter column name for person 2")
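Because the column names are typed in by hand, it can help to check them against the uploaded data before computing anything. The following is a minimal sketch of such a check; the helper missing_columns and the sample DataFrame are hypothetical and not part of the original app.

```python
import pandas as pd


def missing_columns(df, names):
    """Return the requested column names that are not present in df."""
    return [name for name in names if name not in df.columns]


# Hypothetical data with two rater columns
df = pd.DataFrame({"person1": [1, 0, 1], "person2": [1, 1, 1]})

print(missing_columns(df, ["person1", "person2"]))   # []
print(missing_columns(df, ["person1", "rater_b"]))   # ['rater_b']
```

In the app, a non-empty result could be reported to the user instead of attempting the calculation. An empty text input also shows up here, since an empty string is not a column name.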

Function
To use scikit-learn’s Cohen’s kappa calculator, we call sklearn.metrics.cohen_kappa_score on the two rater columns, and we run it only when the user clicks a button created with streamlit.button.
if st.sidebar.button("Calculate"):
    kap = sklearn.metrics.cohen_kappa_score(df[person1], df[person2])

Results and Celebration!
Still inside the button’s if block, we show Cohen’s kappa statistic below the button using streamlit.write.
    st.sidebar.write('Result: %s' % kap)

We then celebrate with streamlit.balloons!!!
    st.balloons()
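The calculation step can also be checked on its own, outside of Streamlit. The sketch below wraps cohen_kappa_score in a small helper that first drops rows where either rater left a rating blank. Dropping incomplete rows is an assumption made for this example, not something the original app does, and the helper name and data are hypothetical.

```python
import pandas as pd
from sklearn.metrics import cohen_kappa_score


def kappa_from_columns(df, col_a, col_b):
    """Cohen's kappa for two rater columns, ignoring rows with a missing rating."""
    paired = df[[col_a, col_b]].dropna()  # keep only items rated by both raters
    return cohen_kappa_score(paired[col_a], paired[col_b])


# Hypothetical ratings; the last item was not rated by person1
df = pd.DataFrame({
    "person1": ["yes"] * 8 + ["no", "no", None],
    "person2": ["yes"] * 7 + ["no", "yes", "no", "yes"],
})

print(round(kappa_from_columns(df, "person1", "person2"), 3))   # 0.375
```

Ten items remain after the incomplete row is dropped. The raters agree on eight of them, and chance agreement is 0.68, which gives a kappa of 0.375.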

Takeaways
This article provided a brief overview of how to create a tool for a common healthcare research metric. I hope you came away with a better understanding of Cohen’s Kappa, Streamlit, and scikit-learn.
The source code for this project can be found here.