
Type I and Type II Errors in COVID-19 Serology Testing

Using R to build confusion matrices and functions, and to determine the efficacy of serology test results.

Currently, the US (and other parts of the world) are experiencing another surge in COVID-19 cases. Since the beginning of the pandemic, testing has been a huge topic of conversation. I personally have not been tested for the virus. I imagine I’ll have to be tested in the future for one reason or another. This got me thinking about the efficacy of testing. Pondering test accuracy turned into an exercise on reinforcing my understanding of type I and type II errors and building functions in R. That will be the focus of this article.

Photo by United Nations COVID-19 Response on Unsplash

The Tests

The tests I will be focusing on in this article are called serology tests, which are more commonly referred to as antibody tests. According to the Johns Hopkins Center for Health Security (JHCHS), the function of a serology test is to determine whether a person has been exposed to a specific pathogen by looking at their immune response. Another test is the RT-PCR test. The JHCHS says the RT-PCR test will indicate the presence of viral material during infection and will not indicate if a person was infected and subsequently recovered. So, in simpler terms:

Serology = Used to determine if a person had the virus.

RT-PCR = Used to determine if a person has the virus.

Sensitivity and Specificity

The measure of how well a test performs in detecting the virus is based on two values: sensitivity and specificity. Sensitivity is a measure of how well the test detects a true positive result. Specificity measures how well the test detects a true negative result. If a test is perfect at detecting positive and negative results, the sensitivity and specificity are both one. When these values are less than one, the test is subject to false positives and false negatives, otherwise known as type I and type II errors. So, for simplicity:

Type I Error = False Positive

Type II Error = False Negative
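To make these definitions concrete before building anything, here is a minimal sketch computing sensitivity and specificity from the four cells of a confusion matrix. The counts below are invented purely for illustration, not taken from any real test:

```r
# hypothetical counts, invented for illustration:
# 100 truly infected people and 1,000 truly uninfected people
tp = 90;  fn = 10    # infected group: 90 test positive, 10 test negative
tn = 990; fp = 10    # uninfected group: 990 test negative, 10 test positive

sens = tp / (tp + fn)  # 0.90: share of the infected the test catches
spec = tn / (tn + fp)  # 0.99: share of the uninfected the test clears
```

A perfect test would score 1 on both; the gaps from 1 are exactly the Type II (false negative) and Type I (false positive) error rates.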

Let’s build a confusion matrix using R to reinforce this knowledge:

# Type I & Type II Error Confusion Matrix
error_table = matrix(data = c('True Pos', 'False Neg/Type II Error',
                              'False Pos/Type I Error', 'True Neg'),
                     nrow = 2,
                     byrow = T,
                     dimnames = list(c('Infected', 'Not Infected'),
                                     c('Positive', 'Negative')))
error_table
             Positive                 Negative                 
Infected     "True Pos"               "False Neg/Type II Error"
Not Infected "False Pos/Type I Error" "True Neg"

Now, let’s use the sensitivity and specificity values reported by the JHCHS for a specific test to add values to the confusion matrix. I chose the Elecsys® Anti-SARS-CoV-2 test by the company Roche. The specificity of the test is 99.81%, but the sensitivity changes over time: it is 65.5% between 0–6 days of infection, 88.1% between 7 and 13 days of infection, and 100% for 14+ days after infection. This makes sense given that the purpose of the serology test is to detect antibodies. The serology test is more effective when more time has elapsed since the moment of infection, because the immune system has had time to develop the requisite antibodies. Let’s write a function to capture the different sensitivity values to use for calculations later on:

# sensitivity function for the Roche test
sensitivity = function(days) {
  if (days <= 6) {
    sen = 0.655
  } else if (days <= 13) {
    sen = 0.881
  } else {
    sen = 1
  }
  return(sen)
}

The last piece of information we need to fill in the confusion matrix is the incidence rate, the proportion of the population that is currently infected. I’m going to use the "14 Day New Cases Per 100,000" for the state of California, which is 306 per 100,000 as of 21 November 2020. Now that we have everything we need, let’s write some code!

# sensitivity and specificity for the Roche test for 0-6 days
sen = sensitivity(0)
spec = 0.9981
# use incidence for California as of 21 NOV 20
california = 306/100000
# write a function to compute and return the confusion matrix
covid_cm = function(n, sensitivity, specificity, incidence){
  # n: number of tests administered to the general population

  infected = incidence * n
  not_infected = n - infected

  true_positive = infected * sensitivity
  false_negative = infected * (1 - sensitivity)

  true_negative = not_infected * specificity
  false_positive = not_infected * (1 - specificity)

  # create the table and round to 0 decimals
  covid_cm = matrix(data = c(round(true_positive,0), 
                             round(false_negative,0), 
                             round(false_positive, 0), 
                             round(true_negative,0)),
                    nrow = 2, 
                    byrow = T,
                    dimnames = list(c("Infected w/ COVID",
                                      "Not Infected"),
                                    c("Test Positive", 
                                      "Test Negative")))
  return(covid_cm)
}
# run the function w/ n = 10000 and observe the output
cm1 = covid_cm(10000, sen, spec, california)
cm1

Here is the output:

                  Test Positive Test Negative
Infected w/ COVID            20            11
Not Infected                 19          9950

There were 19 false positives (Type I errors) and 11 false negatives (Type II errors). These numbers assume the 10,000 tests were administered on people who were between 0 and 6 days of infection. In reality, there would be far more variation between the range of days since infection in the tested population. With that said, let’s write a function to calculate the probability you’ve been infected or not given the result of your test:

true_pos_neg = function(cm){
  # positive predictive value: P(infected | positive test)
  ppv = cm[1,1] / sum(cm[,1])
  # negative predictive value: P(not infected | negative test)
  npv = cm[2,2] / sum(cm[,2])

  return(c(ppv, npv))
}
# pass in the confusion matrix calculated earlier
true_pos_neg(cm1)

Here is the output:

[1] 0.5128205 0.9988957

If you receive a positive test result, there’s just over a 50% chance you’ve been infected. If you receive a negative test result, there’s just under a 99.9% chance you’ve never been infected. Remember, these results assume the entire population we sampled was within its first week of infection, so we shouldn’t expect high performance given that the purpose of a serology test is to determine the presence of antibodies. Let’s perform the same calculation on the population from California with a higher sensitivity value:

# sensitivity = 0.881
cm2 = covid_cm(10000, sensitivity(13), spec, california)
cm2
                  Test Positive Test Negative
Infected w/ COVID            27             4
Not Infected                 19          9950
true_pos_neg(cm2)
[1] 0.5869565 0.9995982
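As a sanity check, this positive predictive value can be computed directly from Bayes’ theorem without building the table at all. The small difference from the figure above comes from the table rounding cell counts to whole people:

```r
# Bayes' theorem:
# P(infected | positive) = sens * p / (sens * p + (1 - spec) * (1 - p))
sens = 0.881          # Roche sensitivity, 7-13 days after infection
spec = 0.9981         # Roche specificity
p    = 306 / 100000   # California incidence, 21 NOV 20
ppv  = (sens * p) / (sens * p + (1 - spec) * (1 - p))
round(ppv, 4)  # 0.5873
```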

The test performs better when the sensitivity is higher, as expected. There are still quite a few false positives, and that’s concerning. Let’s see how the numbers look when the sensitivity is its highest value.

# sensitivity = 1
cm3 = covid_cm(10000, sensitivity(14), spec, california)
cm3
                  Test Positive Test Negative
Infected w/ COVID            31             0
Not Infected                 19          9950
true_pos_neg(cm3)
[1] 0.62 1.00

This is the best test performance so far, but the number of false positives still strikes me as quite high. This got me wondering, how does incidence play into these numbers? I decided to use the incidence rate from Alabama, which as of 21 November 2020 is 566 cases per 100,000 people. I kept the sensitivity value at 1.

# use incidence from alabama w/ sensitivity = 1
alabama = 566/100000
cm4 = covid_cm(10000, sensitivity(14), spec, alabama)
cm4
                  Test Positive Test Negative
Infected w/ COVID            57             0
Not Infected                 19          9925
true_pos_neg(cm4)
[1] 0.75 1.00

Interestingly, the test performed better when applied to a population with a higher incidence rate. This makes sense: when a larger share of the tested population is actually infected, a larger share of positive results are true positives, so the positive predictive value rises. Keep in mind, these results will differ between populations with varying incidence rates and between individuals depending on the length of time since they were initially exposed to the virus. Finding a test with the highest sensitivity and specificity is paramount to limiting error when an individual is trying to determine the presence of antibodies to the virus.
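To make the incidence effect concrete, here is a small sketch sweeping the positive predictive value across several incidence rates, holding sensitivity at 1 and specificity at 0.9981. The 50 and 1,000 per 100,000 values are hypothetical bookends around the California and Alabama figures:

```r
spec = 0.9981
# incidence per 100,000: hypothetical bookends plus CA (306) and AL (566)
incidence = c(50, 306, 566, 1000) / 100000
# with sensitivity = 1: PPV = p / (p + (1 - spec) * (1 - p))
ppv = incidence / (incidence + (1 - spec) * (1 - incidence))
round(ppv, 2)  # 0.21 0.62 0.75 0.84
```

Even a test with perfect sensitivity and 99.81% specificity produces mostly false positives when almost nobody in the tested population is infected.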

Conclusion

The key takeaway from this analysis is that different tests serve different functions. If you’re worried you may be infected, it’s not advisable to take a serology test. It’s more advisable to take a diagnostic test to detect the presence of the virus (read more about that here). If you want to determine if you’ve ever been infected, that’s where the serology test is most effective. The higher the sensitivity and specificity values, the less prone the test will be to Type I and Type II errors. Something I found interesting while researching for this article is that a person may become infectious well before they start to feel ill. With that in mind, it can never hurt to reduce contact with others to stop the spread.

I’d like to note I’m not a medical professional. If you believe I’ve misrepresented anything in this article, please let me know! Finally, I’d like to thank my professor Kevin Crowston for providing the inspiration (and some code) to write this article. Check out his website.

