The world’s leading publication for data science, AI, and ML professionals.

Survival analysis in clinical trials

How do we model drug efficacy? Part 1 – Kaplan-Meier estimator

Survival analysis in clinical trials – Kaplan-Meier estimator

Modeling drug efficacy and the math behind it

Photo by Halacious on Unsplash
Photo by Halacious on Unsplash

From discovering an active substance to establishing its dosing, safety and efficacy, the journey of any drug being introduced to the market is a long and costly one. Ultimately, how do the authorities decide that a new treatment has benefits for patients and should from now on be administered by the doctors?

The most expensive and arguably most important parts of drug development are the later stages of Clinical Trials. During those, the new treatment is given to the target group of patients and (usually) there is also a control group, which is given placebo. The patients are not aware which one of the treatments they are getting and nor are the doctors. This is called double-blinding and ensures the results are not influenced by the degree of belief in the new treatment.


Data format

The first and most frequent analysis that is done when evaluating clinical trial data is a nonparametric estimation of the survival function with the Kaplan-Meier estimator. We will explain this estimator on a data set ovarian from the R package survival. It comes from a study [1] published in 1979 and includes observations of 26 patients with ovarian carcinoma. At the beginning of the study, half of the patients where randomized to the standard treatment group and the other half to the combined treatment group. In this article, we only use the following variables from the data set:

  • futime: survival or censoring time
  • fustat: censoring status (0 = censored, 1 = observed)
  • rx: treatment group (1 = cyclophosphamide alone, 2 = cyclophosphamide plus adriamycin)

Let us now look at the first six observations.

library(survival)
data(ovarian)
head(ovarian)
##   futime fustat     age resid.ds rx ecog.ps
## 1     59      1 72.3315        2  1       1
## 2    115      1 74.4932        2  1       1
## 3    156      1 66.4658        2  1       2
## 4    421      0 53.3644        2  2       1
## 5    431      1 50.3397        2  1       1
## 6    448      0 56.4301        1  1       2

For patients that are alive (fustat=0), we only know that they survived until the time of analysis, but have no more information on their survival time. The output of the function Surv() in R is the array of survival times with a plus sign added to those patients for which death has not been observed.

Surv(ovarian$futime, ovarian$fustat)
## [1]   59   115   156   421+  431   448+  464   475   477+  563 
## [11]  638   744+  769+  770+  803+  855+ 1040+ 1106+ 1129+ 1206+
## [21] 1227+  268   329   353   365   377+

Having this information, how do we go about estimating, say, the median survival time for patients in this study?

Intuition about the survival function

First, we estimate the so-called survival function. If T is the random variable denoting time to death, the survival function S(t) is defined as

This is just the complement to the probability of dying before or at the time t, which is the cumulative distribution function F(t) of T:

So how can S(t) be estimated? Let us first only focus on the subset of patients who died during the follow-up time. You can check that there are twelve such patients (with fustat=1). Based on this data set, how do we estimate the probability that, for example, a patient will live for at least 10 months after the beginning of therapy?

Intuitively, the answer is easy. We calculate the proportion of patients who lived for longer than 10 months, and this will be our estimate of the survival function at 10 months. In this case, it is 8/12 = 2/3.

In fact, we can repeat this exercise for any time point we choose. If no patient died between 10 and 11 months, then the estimate of surviving 11 months will be the same as that for 10 months.

Generally, the answer will differ only if at least one death occurred between the time points in question. Calculating and plotting this probability for all the time points, we obtain the following curve:

Figure 1: Survival function estimate using only patients whose death was observed.
Figure 1: Survival function estimate using only patients whose death was observed.

The steps of the function are exactly the times of deaths.

Estimating survival probability correctly

Obviously, ignoring the patients who are still alive led to an underestimation of the true survival probabilities, since by definition we only analysed a subset of patients with the shortest survival times. For all other patients, we do not know when death will occur, but we have a lower bound for their survival times.

Suppose we want to estimate the probability of surviving past 20 months, but now in the correct way, taking into account the patients who are alive at the time of analysis. This time, we first look at the plot that we get from R using plot.survfit(), with the fit

km <- survfit(Surv(futime/30.42, fustat) ~ 1, data = ovarian)

The points on the curve appear at times when a patients is censored, i.e. times after which we have no information on that patient’s survival time.

Figure 2: Survival function estimate in R. The points on the curve indicate a censoring event.
Figure 2: Survival function estimate in R. The points on the curve indicate a censoring event.

How did R calculate these estimates? Up until the first censored event, we can do the calculations the same way we did it above, except now we have a larger number of patients. And since those added patients are survivors, this will make the probability of survival higher.

Now the (correct) estimate of probability of surviving for more than 10 months is the proportion of all patients who survived for 10 months or more. This proportion is (26–4)/26, which is about 0.846.


Deriving the Kaplan-Meier estimator

To calculate the survival probability after the first censoring event, we use the law of total probability. Let us denote the time of the _i_th event (death or censoring) with tᵢ. Then the probability of survival until at least t is

The first death after the first censored event occurs at t₁₀ (it’s a 10th event in total), which is about 14 months. The probability of surviving until t₇, t₈ and t₉ appears to be the same on the graph. Intuitively, this is because no known deaths occurred, so we cannot update our estimate accordingly. Hence the equation above becomes

It remains to estimate the probability of dying at t₁₀, given that a patient survived until t₉. This is easily done, as from our data we know that at t₁₀, one patient died out of the remaining 26–9=17 patients who we know were still alive (this is called the number at risk at t₁₀). Hence we take this probability to be 1/17.


But now we have a method of finding out the probability of survival for all times up until the time of the last event, t₂₆.

We just derived what is called the Kaplan-Meier estimator of the survival function S(t), the probability of surviving for longer than t. Generally, for any t and a patient’s survival time T, we can rewrite the verbose equation above as

with t being the largest time of event smaller than t. More concisely, the Kaplan-Meier estimator is written as

The variable r is the number of patients at risk at time t and d is the number of deaths occurring at t (note that with our method this is always equal to 1).

As the number of events approaches infinity, this estimator converges to the true survival function S(t), and for this reason it is also called the product limit estimator.

All the terms necessary for the calculation of the Kaplan-Meier estimator in the above formula are easily obtained in R as follows.

summary(km)
##   time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   1.94     26       1    0.962  0.0377        0.890        1.000
##   3.78     25       1    0.923  0.0523        0.826        1.000
##   5.13     24       1    0.885  0.0627        0.770        1.000
##   8.81     23       1    0.846  0.0708        0.718        0.997
##  10.82     22       1    0.808  0.0773        0.670        0.974
##  11.60     21       1    0.769  0.0826        0.623        0.949
##  12.00     20       1    0.731  0.0870        0.579        0.923
##  14.17     17       1    0.688  0.0919        0.529        0.894
##  15.25     15       1    0.642  0.0965        0.478        0.862
##  15.61     14       1    0.596  0.0999        0.429        0.828
##  18.51     12       1    0.546  0.1032        0.377        0.791
##  20.97     11       1    0.497  0.1051        0.328        0.752

The first three colums are needed to calculate the fourth column survival, using the formula above. _tᵢ’_s are the column time, r‘s are the column ** n.risk, and _d**_ᵢ’is the column n.event. The standard error and the corresponding confidence intervals will be discussed later.

Visualizing the treatment effect

We would hope that compared to placebo, or the current drug being commonly used for treatment (called the "standard of care"), the new therapy will prolong the expected survival.

Because of the censored events, we cannot simply compare the sample mean of the survival time between the two groups to do this, as we do not have this statistic. In this article, I will only show how the effect can be visualized, but not how it is formally tested. This is now really the easy part, as we simply plot two different curves for the two different treatment groups, again using plot.survfit(), with the fit

km_groups <- survfit(Surv(futime/30.42, fustat) ~ rx, data = ovarian)
Figure 3: Kaplan-Meier curves estimated separately for the two treatment groups.
Figure 3: Kaplan-Meier curves estimated separately for the two treatment groups.

It appears that the combined treatment might prolong the survival, but we do not know at this point whether the effect is significant.


Confidence intervals

So far, we have not dealt with uncertainty. But with only a handful of patients and hence a relatively small number of events, the curve estimate can be quite inaccurate. Even if we happen to have more patients and events in other studies, just like with any estimate, it is important to assess the precision. One can obtain the pointwise confidence intervals in R easily with the summary.survfit() function. The CI’s in this example look as follows:

Figure 4: Kaplan-Meier curves with confidence intervals.
Figure 4: Kaplan-Meier curves with confidence intervals.

Note that the right portion of the curve has rather large confidence intervals, which is given by the low number of observed events later in time. But where do these these confidence bounds come from?

To compute the confidence intervals, we need an estimate of the variance of the KM estimator and its (asymptotic) distribution.

Greenwood’s formula

R uses the Greenwood’s formula [2] when computing confidence intervals for the variance estimate, which is

We can then use the property that, under certain conditions, the distribution of the KM estimate converges to normal, with the Greenwood’s variance being the asymptotic variance:

This allows us to get pointwise confidence intervals at any t, using the formula

with quantiles of the standard normal distribution. However, the default conf.type argument in R (conf.type=log) is actually making use of the distribution of the log of the S(t) estimate, and the confidence intervals have the following form:

Lastly, another type of confidence interval used is the log-minus-log (conf.type=log-log). Its bounds always lie within the (0,1) interval, which is a desirable property, as we are estimating probabilities.

Summary

The Kaplan-Meier plot, which shows an estimate of the survival function, is usually the first point of analysis of a clinical study. It can be used to visualize the treatment effect and has the advantage that it does not rely on any assumptions about the distribution of the data. The more events we observe, the closer it will be to the true survival function.

Hence, the main summarizing points are that the KM estimator

  • Serves as an estimator of the survival function.
  • Does not depend on any distributional assumptions (i.e. is non-parametric).
  • Is commonly used for visualizing the results of a clinical study.

A major limitation is that we have no way of controlling for covariates with this method. Therefore, we cannot reliably compare treatment groups in case there is an imbalance of some covariate with a significant effect on the survival time.

Limitations

  • The KM estimator is only descriptive. It cannot be used, for example, to test for a treatment effect. This can be done with a log-rank test.
  • The right portion of the curve tends to be very inaccurate, which should be taken into account when interpreting the graph.
  • We cannot control for covariates.

References

[1] J.H. Edmunson, T.R. Fleming, D.G. Decker, G.D. Malkasian, J.A. Jefferies, M.J. Webb and L.K. Kvols, Different Chemotherapeutic Sensitivities and Host Factors Affecting Prognosis in Advanced Ovarian Carcinoma vs. Minimal Residual Disease (1979). Cancer Treatment Reports, 63:241–47.

[2] M. Greenwood, The Natural Duration of Cancer (1926). Reports on Public Health and Medical Subjects, Her Majesty’s Stationary Office, London, 33, 1–26.


Related Articles