Getting Started with R Shiny

Take the first steps towards becoming an R Shiny expert

Sheenal Srivastava
Towards Data Science


Photo by Luke Chesser on Unsplash

Intro to R Shiny

Shiny is an R package that allows programmers to build web applications within R. For someone like me, who found building GUI applications in Java really hard, Shiny makes it much easier.

This blog article will get you building Shiny apps straight away with working examples. First things first, make sure you install the shiny package.

install.packages("shiny")

Shiny App Structure

Like other R scripts, Shiny app files end with a .R extension. The app structure consists of three components, which are:

  1. A user interface object (ui)
  2. A server function
  3. A function call to shinyApp

The user interface (ui) object is used to manage the appearance and layout of the app, such as radio buttons, panels, and selection boxes.

# Define UI for app
ui <- fluidPage(

  # App title ----
  titlePanel("!"),

  # Sidebar layout with input and output definitions ----
  sidebarLayout(

    # Sidebar panel for inputs
    sidebarPanel(
      # Allow user to select input for a y-axis variable from drop-down
      selectInput("y_varb", label = "Y-axis variable", choices = names(data)[c(-1, -3, -4)])
    ),

    # Main panel for outputs
    mainPanel(
      # Output: Plot
      plotOutput("plot", dblclick = "plot_reset")
    )
  )
)

The server function contains the instructions that are required to build the app, for example how to generate a plot or table and how to react to user clicks.

# Define server logic here ----
server <- function(input, output) {

  # 1. It is "reactive" and therefore updates the x-variable choices based on the y-variable selection
  remaining <- reactive({
    names(data)[c(-1, -3, -4, -match(input$y_varb, names(data)))]
  })

  observeEvent(remaining(), {
    choices <- remaining()
    updateSelectInput(session = getDefaultReactiveDomain(), inputId = "x_varb", choices = choices)
  })

  # 2. Its output type is a plot
  output$plot <- renderPlot({
    #ADD YOUR GGPLOT CODE HERE
    subset_data <- data[1:input$sample_sz, ]
    ggplot(subset_data, aes_string(input$x_varb, input$y_varb)) +
      geom_point(aes_string(colour = input$cat_colour)) +
      geom_smooth(method = "lm", formula = input$formula)
  }, res = 96)
}

Lastly, the shinyApp function builds Shiny app objects based on the UI/server pair.

library(shiny)
# Build shiny object
shinyApp(ui = ui, server = server)
# Alternatively, if the app is saved as app.R in a folder called my_app,
# run it from that folder's parent directory with
# runApp("my_app")
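To see how the three components fit together, here is a minimal sketch of a complete app file. The input ID "n", the output ID "hist", and the histogram itself are assumptions made up for illustration; they are not part of the apps built later in this article.

```r
library(shiny)

# 1. User interface object: one slider and one plot
ui <- fluidPage(
  sliderInput("n", label = "Number of observations", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# 2. Server function: draws a histogram of n random normal values
server <- function(input, output, session) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = "Random normal sample", xlab = "Value")
  })
}

# 3. Call to shinyApp: combines the UI/server pair into an app object
app <- shinyApp(ui = ui, server = server)

# Launch the app only when R is being used interactively
if (interactive()) runApp(app)
```

Moving the slider changes input$n, which causes renderPlot to redraw the histogram.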

What is Reactivity?

Reactivity is all about linking user inputs to app outputs, so that the display in the Shiny app is dynamically updated based on user input or selection.

The building blocks of reactive programming

Input

The input object, which is passed to the server function, gives the app access to its input fields. It is a list-like object that contains all the input data sent from the browser, named according to the input ID.

Unlike a typical list, input objects are read-only. If you attempt to modify an input inside the server function, you’ll get an error, because an input can only be set by the user in the browser. To read from an input, you must be in a reactive context created by a function like renderText() or reactive(). A reactive expression is created by passing a normal expression into reactive().
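As a minimal sketch of reading an input inside a reactive context, the example below uses testServer() (available in shiny 1.5.0 and later) to simulate a session without opening a browser. The input ID "n" and the names doubled and msg are hypothetical.

```r
library(shiny)

server <- function(input, output, session) {
  # reactive() creates a reactive context, so reading input$n is allowed here
  doubled <- reactive(input$n * 2)

  # renderText() is also a reactive context
  output$msg <- renderText(paste("Twice n is", doubled()))
}

# testServer() runs the server function against a mock session
testServer(server, {
  session$setInputs(n = 5)   # plays the role of the user changing the input
  print(doubled())
  print(output$msg)
})
```

Here session$setInputs() stands in for the user; the server code itself never assigns to input.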

Output

To allow reactive values or inputs to be viewed in an app, they need to be assigned to the output object. Like the input object, it is also a list-like object. It typically works with a render function like renderTable().

The render function performs the following two operations:

  1. It sets up a reactive context that is used to automatically associate inputs with outputs.
  2. It converts the R code output into HTML code to allow it to be displayed on an app.
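The two operations can be observed with a small sketch. It assumes a hypothetical input called "rows" and output called "preview", and again uses testServer() so that no browser is needed.

```r
library(shiny)

server <- function(input, output, session) {
  # renderTable() re-runs whenever input$rows changes (operation 1)
  # and turns the resulting data frame into HTML markup (operation 2)
  output$preview <- renderTable(head(mtcars, input$rows))
}

testServer(server, {
  session$setInputs(rows = 3)
  html <- output$preview
  # The rendered value is HTML text rather than a data frame
  print(is.character(html))
  print(substr(html, 1, 80))
})
```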

Reactive Expressions

Reactive expressions are used to split the code within the app into manageable chunks that not only reduce code duplication but also avoid re-computation, because a reactive expression caches its result until one of the values it depends on changes.
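The caching behaviour can be sketched at the R console. In this illustration, reactiveVal() stands in for a user input and isolate() is used to read reactives outside a running app; the counter n_runs and the name slow_square are made up for the demonstration.

```r
library(shiny)

# Counter used only to show how often the reactive body actually runs
n_runs <- 0

x <- reactiveVal(10)            # a reactive value standing in for a user input
slow_square <- reactive({
  n_runs <<- n_runs + 1         # side effect for demonstration only
  x()^2
})

# isolate() lets us read reactives from the console, outside a Shiny session
isolate(slow_square())          # first call: the body runs
isolate(slow_square())          # second call: the cached result is reused
runs_before_change <- n_runs

x(4)                            # changing the dependency invalidates the cache
isolate(slow_square())          # the body runs again with the new value
runs_after_change <- n_runs
```

Comparing runs_before_change with runs_after_change shows that the body is only re-evaluated after its dependency has changed.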

In the example below, the reactive code receives six inputs from the user (a sample size, mean, and standard deviation for each of two groups), which are then used to output results from an independent-samples t-test.

server <- function(input, output) {
  x1 <- reactive(rnorm(input$n1, input$mean1, input$sd1))
  x2 <- reactive(rnorm(input$n2, input$mean2, input$sd2))
  output$ttest <- renderPrint({
    t.test(x1(), x2())
  })
}
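The server above needs a matching UI before it can run. The following is one possible sketch: the input and output IDs are taken from the server code, while the labels, default values, and two-column layout are assumptions chosen for illustration.

```r
library(shiny)

ui <- fluidPage(
  fluidRow(
    column(6,
      numericInput("n1", "Sample size (group 1)", value = 50, min = 2),
      numericInput("mean1", "Mean (group 1)", value = 0),
      numericInput("sd1", "Standard deviation (group 1)", value = 1, min = 0)
    ),
    column(6,
      numericInput("n2", "Sample size (group 2)", value = 50, min = 2),
      numericInput("mean2", "Mean (group 2)", value = 0.5),
      numericInput("sd2", "Standard deviation (group 2)", value = 1, min = 0)
    )
  ),
  # renderPrint() output is displayed with verbatimTextOutput()
  verbatimTextOutput("ttest")
)

server <- function(input, output, session) {
  # Each sample is simulated once and reused wherever x1() or x2() is called
  x1 <- reactive(rnorm(input$n1, input$mean1, input$sd1))
  x2 <- reactive(rnorm(input$n2, input$mean2, input$sd2))
  output$ttest <- renderPrint(t.test(x1(), x2()))
}

if (interactive()) runApp(shinyApp(ui = ui, server = server))
```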

Shiny App Examples

Now that we know some basics, let’s build some apps.

App One

The app uses a dummy dataset that includes three continuous variables. The requirements for the app are as follows:

  1. Produce a scatterplot between two of the three continuous variables in the dataset, where a variable can’t be plotted against itself.
  2. The user can select the x- and y-variables for plotting.
  3. The user can select the formula for plotting (“y~x”, “y~poly(x,2)”, “y~log(x)”).
  4. The categorical variable selected is used to colour the points in the scatterplot.
  5. The user can select the number of rows (sample size), ranging from 1 to 1,000, for plotting.

#Load libraries
library(shiny)
library(ggplot2)
#Create dummy dataset
k<-1000
set.seed(999)
data<-data.frame(id=1:k)
data$gest.age<-round(rnorm(k,34,.5),1)
data$gender<-factor(rbinom(k,1,.5),labels=c("female","male"))
z = -1.5+((((data$gest.age-mean(data$gest.age)))/sd(data$gest.age))*-1.5)
pr = 1/(1+exp(-z))
data$mat.smoke = factor(rbinom(k,1,pr))
data$bwt<- round(-3+data$gest.age*0.15+
((as.numeric(data$mat.smoke)-1)*-.1)+
((as.numeric(data$mat.smoke)-1))*((data$gest.age*-0.12))+
(((as.numeric(data$mat.smoke)-1))*(4))+
((as.numeric(data$gender)-1)*.2)+rnorm(k,0,0.1),3)
data$mat.bmi<-round((122+((data$bwt*10)+((data$bwt^8)*2))/200)+
rnorm(k,0,1.5)+(data$gest.age*-3),1)
rm(z, pr, k)

#Define UI
ui <- fluidPage(


#1. Select 1 of 3 continuous variables as y-variable and x-variable
selectInput("y_varb", label="Y-axis variable",choices=names(data)[c(-1,-3,-4)]),
selectInput("x_varb", label="X-axis variable", choices=NULL),
#2. Colour points using categorical variable (1 of 2 options: gender or mat.smoke)
selectInput("cat_colour", label="Select Categorical variable", choices=names(data)[c(-1,-2,-5,-6)]),
#3. Select sample size
selectInput("sample_sz", label = "Select sample size", choices = c(1:1000)),
#4. Three different types of linear regression plots
selectInput("formula", label="Formula", choices=c("y~x", "y~poly(x,2)", "y~log(x)")),
#5. Plot output (a double-click on the plot is reported as input$plot_reset)
plotOutput("plot", dblclick = "plot_reset")

)
server <- function(input, output) {

#1. Register the y-variable selected, the remaining variables are now options for x-variable
remaining <- reactive({
names(data)[c(-1,-3,-4,-match(input$y_varb,names(data)))]
})

observeEvent(remaining(),{
choices <- remaining()
updateSelectInput(session = getDefaultReactiveDomain(),inputId = "x_varb", choices = choices)
})


output$plot <- renderPlot({
#Produce scatter plot
subset_data<-data[1:input$sample_sz,]
ggplot(subset_data, aes_string(input$x_varb, input$y_varb))+
geom_point(aes_string(colour=input$cat_colour))+
geom_smooth(method="lm",formula=input$formula)
}, res = 96)
}
# Run the application
shinyApp(ui = ui, server = server)

Figure 1: User has to select from 5 different inputs

Once the user has selected the inputs, the scatter plot below is generated.

Figure 2: Output graph from App One

There you go, that’s your first web app built. However, let’s break the code down further.

App One Explanation

The function selectInput is used to display the drop-down menus from which a user selects the variables for plotting, the variable for colouring the points in the scatter plot, the sample size, and the formula.

Within the server function, the choices for the x-variable are updated based on the user’s selection for the y-variable. The observeEvent function is used to perform an action in response to an event. In this example, it updates the list of variables that can be selected for the x-axis, so that the same variable cannot be plotted on both the y-axis and the x-axis.
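The negative indexing inside the remaining() reactive can be tried out in plain R. The helper remaining_choices() below is hypothetical; it simply mirrors the expression used in the app, applied to the column names of the dummy dataset.

```r
# Column names of the dummy dataset used in App One
cols <- c("id", "gest.age", "gender", "mat.smoke", "bwt", "mat.bmi")

# Hypothetical helper mirroring the body of the remaining() reactive
remaining_choices <- function(y_varb, cols) {
  # Drop id (1), gender (3), mat.smoke (4) and whichever column is on the y-axis
  cols[c(-1, -3, -4, -match(y_varb, cols))]
}

remaining_choices("bwt", cols)       # "gest.age" "mat.bmi"
remaining_choices("gest.age", cols)  # "bwt" "mat.bmi"
```

Whichever variable is chosen for the y-axis, only the other two continuous variables are offered for the x-axis.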

The function renderPlot is then used to take all the inputs (x-variable, y-variable, categorical variable, sample size, and formula) and generate, or ‘render’, a plot.
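One caveat: aes_string() is deprecated in recent versions of ggplot2. A possible alternative, sketched below, uses the .data pronoun and converts the formula string explicitly. The helper make_scatter() and the toy data frame are assumptions for illustration and are not part of the original app.

```r
library(ggplot2)

# Hypothetical helper: builds the App One scatter plot from plain strings,
# which is the form in which Shiny inputs arrive
make_scatter <- function(data, x_varb, y_varb, cat_colour,
                         formula = "y~x", n = nrow(data)) {
  # selectInput() returns text, so the sample size is converted to a number
  subset_data <- data[seq_len(as.numeric(n)), ]
  ggplot(subset_data, aes(.data[[x_varb]], .data[[y_varb]])) +
    geom_point(aes(colour = .data[[cat_colour]])) +
    geom_smooth(method = "lm", formula = as.formula(formula))
}

# Small made-up data frame so that the sketch runs on its own
set.seed(1)
toy <- data.frame(
  gest.age = rnorm(50, 34, 0.5),
  bwt = rnorm(50, 2, 0.2),
  gender = factor(sample(c("female", "male"), 50, replace = TRUE))
)

p <- make_scatter(toy, "gest.age", "bwt", "gender", formula = "y~poly(x,2)", n = "30")
```

Inside renderPlot, the call would receive input$x_varb, input$y_varb, input$cat_colour, input$formula, and input$sample_sz in place of the literal strings.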

There you have it, your first Shiny App.

Now, what if we wanted the sample size to be chosen with a slider instead of a drop-down? You can replace the sample-size selectInput in the UI with the sliderInput from this code snippet.

ui <- fluidPage(

  # ... the other inputs and the plot output stay as they are ...

  #3. Select sample size
  # numericInput("sample_sz", "Input sample size", 1000),
  sliderInput(inputId = "sample_sz",
              label = "Select sample size:",
              min = 1,
              max = 1000,  # the dummy dataset has 1,000 rows
              value = 1)
)

The output will then look like what’s displayed in Figure 3.

Figure 3: Output from App One showing a slider for sample size

Now, let’s move on to the next example, which is slightly more complicated.

App Two

In this app, we use a dataset that records five life-threatening outcomes of COVID-19 infection. We want the app to do the following:

  1. Allow the user to select one of the five outcomes as an input
  2. Compare the different counties by the selected outcome in a forest plot
  3. If a user selects a county in the plot, then another forest plot should be generated that shows all five outcomes for the selected county.
  4. Display an error message for missing values.

#Load libraries
library(shiny)
library(ggplot2)
library(exactci)
#import the data and restrict the data to variables and patients of interest

patient.data<-read.csv("Synthea_patient_covid.csv", na.strings = "")
cons.data<-read.csv("Synthea_conditions_covid.csv", na.strings = "")
data<-merge(cons.data, patient.data)
data<-data[which(data$covid_status==1),]
data<-data[,c(12,14,65,96,165,194)]
#Get the state average for each outcome
state.average<-apply(data[,-6],MARGIN=2,FUN=mean)
#Make better names for the outcomes
names<-c("pulmonary embolism", "respiratory failure", "dyspnea", "hypoxemia", "sepsis")
#Calculate and aggregate the obs and exp by outcome and county
data$sample.size<-1
list<-list()
for (i in 1:5) {
list[[i]]<-cbind(outcome=rep(names[i],length(unique(data$COUNTY))),
aggregate(data[,i]~COUNTY, sum,data=data),
exp=round(aggregate(sample.size~COUNTY, sum,data=data)[,2]
*state.average[i],2))
}
plot.data<-do.call(rbind,list)
names(plot.data)[3]<-"obs"
#Lastly, obtain the smr (called est), lci and uci for each row
#Add confidence limits
plot.data$est<-NA
plot.data$lci<-NA
plot.data$uci<-NA
#Calculate the estimate and confidence interval for each row with a for loop and add them to the plot.data
for (i in 1:nrow(plot.data)){
#Run the exact Poisson test once per row and reuse the result
fit<-poisson.exact(plot.data[i,3],plot.data[i,4],tsmethod = "central")
plot.data[i,5]<-as.numeric(fit$estimate)
plot.data[i,6]<-as.numeric(fit$conf.int[1])
plot.data[i,7]<-as.numeric(fit$conf.int[2])
}
# Define UI for application
ui <- fluidPage(
selectInput("outcome_var", label="Outcome",unique(plot.data$outcome)),
plotOutput("plot", click = "plot_click"),
plotOutput("plot2")
)
server <- function(input, output) {

output$plot <- renderPlot({
#Output error message for missing values
validate(
need( nrow(plot.data) > 0, "Data insufficient for plot")
)
ggplot(subset(plot.data,outcome==input$outcome_var), aes(x = COUNTY,y = est, ymin = pmax(0.25,lci),
ymax = pmin(2.25,uci))) +
geom_pointrange()+
geom_hline(yintercept =1, linetype=2)+
coord_flip() +
xlab("")+ ylab("SMR (95% Confidence Interval)")+
theme(legend.position="none")+ ylim(0.25, 2.25)
}, res = 96)

#Second forest plot: same layout as above, but showing all outcomes for the county clicked in the first plot

output$plot2 <- renderPlot({
validate(
need( nrow(plot.data) > 0, "Data insufficient for plot as it contains a missing value")
)
#Output error message for trying to round a value when it doesn't exist/missing
validate(
need( as.numeric(input$plot_click$y), "Non-numeric y-value selected")
)


forestp_data<-plot.data[which(plot.data$COUNTY==names(table(plot.data$COUNTY))[round(input$plot_click$y,0)]),]
ggplot(forestp_data, aes(x = outcome,y = est, ymin = pmax(0.25,lci), ymax = pmin(2.25,uci))) +
geom_pointrange()+
geom_hline(yintercept =1, linetype=2)+
coord_flip() +
xlab("")+ ylab("SMR (95% Confidence Interval)")+
theme(legend.position="none")+ ylim(0.25, 2.25)
})

}
# Run the application
shinyApp(ui = ui, server = server)

App Two Explanation

In this app, like App One, the user selects an outcome from the drop-down box. This results in a forest plot. The calculations for this forest plot are done outside of the Shiny user interface and server function, so they run once when the app starts. The tricky bit in this application is to build a reactive second plot in response to the county that the user clicks in the first plot.
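To make the pre-computation concrete, here is a sketch of the calculation for a single row of plot.data. The observed and expected counts are made up, and base R's poisson.test() is swapped in for exactci::poisson.exact() so that the sketch needs no extra package; the two should give a comparable central exact interval, but that equivalence is an assumption here.

```r
# Hypothetical counts for one county and one outcome
observed <- 12     # observed number of cases in the county
expected <- 8.5    # county sample size multiplied by the state average

# poisson.test() estimates the rate observed/expected with an exact interval
fit <- poisson.test(observed, expected)

est <- unname(fit$estimate)   # the ratio of observed to expected (SMR)
lci <- fit$conf.int[1]        # lower 95% confidence limit
uci <- fit$conf.int[2]        # upper 95% confidence limit

round(c(est = est, lci = lci, uci = uci), 2)
```

An estimate above 1 means the county saw more cases than the state average would predict, which is why the forest plots draw a dashed reference line at 1.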

This reactivity happens in the server function using this line

plot.data[which(plot.data$COUNTY==names(table(plot.data$COUNTY))[round(input$plot_click$y,0)]),]

which checks which county was clicked, so that the second forest plot can display all five outcomes for that county.
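The lookup can be illustrated in plain R. The county names and the click position below are made up; the sketch relies on the assumption that the counties appear on the axis in sorted order at positions 1, 2, 3, and so on, which is how ggplot2 normally places character values on a discrete axis.

```r
# Hypothetical county column, in arbitrary order and with repeats
county <- c("Suffolk", "Essex", "Norfolk", "Essex", "Suffolk", "Bristol")

# names(table(x)) returns the distinct values in sorted order
axis_levels <- names(table(county))

# A click near the third tick mark might report y = 2.87 (made-up value)
click_y <- 2.87

# Rounding the position gives the index of the nearest county on the axis
selected <- axis_levels[round(click_y, 0)]
selected   # "Norfolk"
```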

Figure 4: First forest plot in App 2 is generated based on selected outcome
Figure 5: Clicking on the first forest plot for a given county, generates a second plot below showing all five outcomes for the selected county

In the app, a message is displayed using the validate function. Before the user has clicked on the first plot, input$plot_click is missing, so there is no value to round and the following piece of syntax would otherwise raise an error:

[round(input$plot_click$y,0)],]
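The behaviour of need(), which validate() relies on, can be checked outside an app. In this sketch, no_click and click are made-up stand-ins for input$plot_click before and after the user clicks on the plot.

```r
library(shiny)

msg <- "Non-numeric y-value selected"

# Before any click, input$plot_click is NULL, so its y element is NULL too
no_click <- NULL
before <- need(as.numeric(no_click$y), msg)   # returns the message: the check fails

# After a click there is a number, so need() returns NULL: the check passes
click <- list(x = 1.1, y = 2.87)              # made-up click coordinates
after <- need(as.numeric(click$y), msg)

before
is.null(after)
```

When a check fails inside validate(), Shiny stops rendering that output and shows the message in its place instead of a red error.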

App Three

The final web app’s objective is to allow the user to plot the relationship between the variables in the data set. Some variables are categorical, while others are continuous. When plotting two continuous variables, the user should only be able to plot them on a scatter plot. When one continuous variable and one categorical variable are selected, the user should only be able to make a box plot. The user should not be able to make a plot with two categorical variables. The app should only display one plot at a time, based on the user’s selection.

#Load libraries
library(shiny)
library(ggplot2)
#import data
setwd("C:/Users/User/Desktop/Synthea")
patient.data<-read.csv("Synthea_patient_covid.csv", na.strings = "")
obs.data<-read.csv("Synthea_observations_covid.csv", na.strings = "")
patient.data$dob<-as.Date(patient.data$BIRTHDATE, tryFormats = "%d/%m/%Y")
patient.data$enc.date<-as.Date(patient.data$DATE, tryFormats = "%d/%m/%Y")
patient.data$age<-as.numeric((patient.data$enc.date-patient.data$dob)/365.25)
data<-merge(obs.data, patient.data)
data<-na.omit(data[,c(4,5,10:12,20,24,32,18,15,23,31,33,48:51,60)])
names(data)<-substr(names(data),1,10)
data$covid_stat<-as.factor(data$covid_stat)
data$dead<-as.factor(data$dead)
# Define UI for application
ui <- fluidPage(
#1. Select 1 of many continuous variables as y-variable
selectInput("y_varb", label="Y-axis variable",choices=names(data)[c(-1,-2,-14,-15,-16,-17)]),

#2 Select any variable in dataset as x-variable
selectInput("x_varb", label="X-axis variable", choices=names(data)),


#3. Plot output (a double-click on the plot is reported as input$plot_reset)
plotOutput("plot", dblclick = "plot_reset")
)
server <- function(input, output) {

remaining <- reactive({
names(data)[-match(input$y_varb,names(data))]
})

observeEvent(remaining(),{
choices <- remaining()
updateSelectInput(session = getDefaultReactiveDomain(),inputId = "x_varb", choices = choices)
})

output$plot <- renderPlot({
if ( is.numeric(data[[input$x_varb]]) ) {
ggplot(data, aes_string(input$x_varb, input$y_varb)) + geom_point()
} else {
ggplot(data, aes_string(input$x_varb, input$y_varb)) + stat_boxplot()
}
})


}
# Run the application
shinyApp(ui = ui, server = server)

Figure 6: Boxplot displayed if a categorical and a continuous variable are selected
Figure 7: Scatter plot is displayed if two continuous variables are selected

App Three Explanation

The tricky bit in this app is using the if-else condition within renderPlot so that only one of the two types of plots (scatter plot or box plot) is drawn. Furthermore, to check the class of the selected x-variable, double square brackets are required, data[[input$x_varb]], because they extract the column itself. Checking input$x_varb directly will always return character, since it only holds the column’s name.
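The difference is easy to see with a small made-up data frame, where x_varb plays the role of input$x_varb:

```r
# Made-up data frame with one continuous and one categorical column
df <- data.frame(age = c(34.2, 51.8, 47.0), dead = factor(c("0", "1", "0")))

x_varb <- "age"   # what input$x_varb would hold: the column name

class(x_varb)                # "character", whatever the column contains
class(df[[x_varb]])          # "numeric": the class of the column itself
is.numeric(df[[x_varb]])     # TRUE, so App Three would draw a scatter plot
is.numeric(df[["dead"]])     # FALSE, so App Three would draw a box plot
```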

Conclusion

So, there you have it, three simple apps. Let’s recap very quickly on what was covered in this blog article.

  • Structure of an R Shiny app
  • Linking inputs and outputs
  • Creating a dynamic user interface by using reactive() and observeEvent()
  • Displaying error messages using validate()
