Introduction
A time series is a series of data points indexed in time order. The time index can be expressed in days, weeks, months, or years. The most common way to visualize time series data is a simple line chart, where the horizontal axis plots the increments of time and the vertical axis plots the variable being measured. Such a chart can be drawn with geom_line() in ggplot2 or simply with the plot() function in base R.
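As a quick illustration of the two line-chart options just mentioned, here is a minimal sketch. The data frame ts_data and its values are made up for this example and are not part of the case study.

```r
library(ggplot2)

# Synthetic daily series (made-up values, for illustration only)
set.seed(1)
ts_data <- data.frame(
  date  = seq(as.Date("2018-01-01"), by = "day", length.out = 90),
  value = cumsum(rnorm(90))
)

# Base R: type = "l" draws a line instead of points
plot(ts_data$date, ts_data$value, type = "l", xlab = "Date", ylab = "Value")

# ggplot2 equivalent: time on the x axis, measured variable on the y axis
line_plot <- ggplot(ts_data, aes(x = date, y = value)) + geom_line()
line_plot
```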
In this tutorial, I will introduce another tool for visualizing time series data: the time-series calendar heatmap. We will look at how time-series calendar heatmaps can be drawn using ggplot2. We will also explore the calendarHeat() function written by Paul Bleicher (released as open source under the GPL license), which provides an easy way to create the visualization.
Motivation
If you have ever been to GitHub, you have probably come across the contribution chart on a user's profile, which shows the number of contributions made by that user over the past year. The color of each tile represents the number of contributions (as described by the legend at the bottom right of the chart). Here, the calendar heatmap provides a visually appealing way to show the number of commits made by the user across the year in a calendar-like view, making it easy to identify daily patterns or anomalies.

Another great example can be found in the Wall Street Journal article below, which shows the number of infected people measured over 70 years across all 50 US states.
http://graphics.wsj.com/infectious-diseases-and-vaccines/
Here, the calendar heatmap makes it easy to identify yearly patterns in the number of infected people for various diseases.
Case Study
To illustrate the use of calendar heatmaps, we will visualize Amazon's stock price (NASDAQ: AMZN) over the past 5 years. We will be looking at the adjusted closing prices, which will be obtained through the tidyquant package.
Packages
We will install and import the tidyquant package to obtain the stock prices of Amazon. We will also install and import ggplot2 to perform the visualization. The R code for calendarHeat() can be downloaded through Paul Bleicher's GitHub page.
# install tidyquant
install.packages('tidyquant', repos = "http://cran.us.r-project.org")
library(tidyquant)
# install ggplot2
install.packages("ggplot2", repos = "http://cran.us.r-project.org")
library(ggplot2)
# Load the calendarHeat() function into the local session from GitHub
source("https://raw.githubusercontent.com/iascchen/VisHealth/master/R/calendarHeat.R")
Loading the Data
# Get the AMZN daily prices using tidyquant
amznStock <- as.data.frame(tidyquant::tq_get("AMZN", get = "stock.prices"))
# Keep only the data after 2012
amznStock <- amznStock[year(amznStock$date) > 2012, ]
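If the download is not available (for example, when working offline), the rest of the tutorial can still be followed with a stand-in data frame that has the same two columns used later, date and adjusted. The sketch below builds one from made-up numbers; fakeStock is a hypothetical name, the prices are random and not real AMZN quotes, and market holidays are ignored.

```r
# Hypothetical stand-in for the downloaded data (random values, not real prices)
set.seed(42)
all_days <- seq(as.Date("2012-01-02"), as.Date("2018-12-31"), by = "day")

# "%u" gives the weekday as 1 (Monday) to 7 (Sunday); drop Saturdays and Sundays
trading_days <- all_days[!format(all_days, "%u") %in% c("6", "7")]

fakeStock <- data.frame(
  date     = trading_days,
  adjusted = 250 + cumsum(rnorm(length(trading_days)))
)

# Same "after 2012" filter as above, written with base R only
fakeStock <- fakeStock[as.numeric(format(fakeStock$date, "%Y")) > 2012, ]
str(fakeStock)
```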
Using ggplot2
The process of creating a calendar heatmap with ggplot2 is somewhat cumbersome. We need to get the data into the right shape before the heatmap can be made. The code below lists the steps for munging the data to create the calendar heatmap using ggplot2.
library(plyr)
library(plotly)
amznStock$weekday <- as.POSIXlt(amznStock$date)$wday # day of the week as a number (0 = Sunday, 1 = Monday, ..., 6 = Saturday)
amznStock$weekdayf <- factor(amznStock$weekday, levels = rev(c(1:6, 0)), labels = rev(c("Mon","Tue","Wed","Thu","Fri","Sat","Sun")), ordered = TRUE) # day number as an ordered factor, reversed so that Monday is drawn at the top
amznStock$monthf <- factor(month(amznStock$date), levels = as.character(1:12), labels = c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"), ordered = TRUE) # month as an ordered factor
amznStock$yearmonth <- factor(as.yearmon(amznStock$date)) # year and month of each date, e.g. Nov 2018
amznStock$week <- as.numeric(format(amznStock$date, "%W")) # week of the year for each date (weeks start on Monday)
amznStock <- ddply(amznStock, .(yearmonth), transform, monthweek = 1 + week - min(week)) # normalize the week to start at 1 within every month
p <- ggplot(amznStock, aes(monthweek, weekdayf, fill = adjusted)) +
  geom_tile(colour = "white") +
  facet_grid(year(date) ~ monthf) +
  scale_fill_gradient(low = "red", high = "green") +
  xlab("Week of Month") + ylab("") +
  ggtitle("Time-Series Calendar Heatmap: AMZN Stock Prices") +
  labs(fill = "Price")
p
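To make the week-of-month step easier to follow, here is a small worked example on five hand-picked dates. It is a sketch that uses base R's ave() in place of ddply(); the data frame toy and its columns are introduced only for this illustration.

```r
# Five hand-picked dates: four in November 2018 and one in December 2018
toy <- data.frame(
  date = as.Date(c("2018-11-01", "2018-11-02", "2018-11-05",
                   "2018-11-12", "2018-12-03"))
)

# Day of the week: 0 = Sunday, 1 = Monday, ..., 6 = Saturday
toy$weekday <- as.POSIXlt(toy$date)$wday

# Week of the year, with weeks starting on Monday
toy$week <- as.numeric(format(toy$date, "%W"))

# Within each year-month, shift the week numbers so the first observed week is 1
toy$yearmonth <- format(toy$date, "%Y-%m")
toy$monthweek <- ave(toy$week, toy$yearmonth, FUN = function(w) 1 + w - min(w))

toy
# monthweek comes out as 1, 1, 2, 3 for the November dates and 1 for December
```

Note that the week number is relative to the first date present in each month, so a month whose data starts late begins at week 1 on that first observed date.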

Using calendarHeat()
calendarHeat() makes the process much easier. We just need to call the function and specify the following five arguments.
date: Dates for which the data needs to be plotted.
values: Values associated with those dates.
color: The color palette. Default is r2g (red to green). Other predefined options are r2b (red to blue) and w2b (white to blue). You can create your own palette by defining a vector as shown below.
ncolors: Number of colors for the heatmap.
varname: Title for the chart.
r2g <- c("#D61818", "#FFAE63", "#FFFFBD", "#B5E384") # custom palette defined as a vector of hex colors
calendarHeat(amznStock$date, amznStock$adjusted, ncolors = 99, color = "r2g", varname = "AMZN Adjusted Close")
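One way to think about the color and ncolors arguments together is that a short palette vector gets stretched into ncolors intermediate shades. The sketch below shows that idea with base R's colorRampPalette(). It illustrates the concept only and is not taken from the calendarHeat() source; my_palette and shades are hypothetical names.

```r
# A four-color palette, written as hex codes (same values as r2g above)
my_palette <- c("#D61818", "#FFAE63", "#FFFFBD", "#B5E384")

# colorRampPalette() returns a function that interpolates between the given colors
make_shades <- colorRampPalette(my_palette, space = "Lab")
shades <- make_shades(99)

length(shades)   # one hex color per requested shade
head(shades, 3)  # the first few shades, starting near the red end
```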

When to use calendarHeat() vs. geom_tile() in ggplot2?
calendarHeat() is a predefined function, so it offers less flexibility in how the graph can be modified. To change the figure in ways other than the five arguments of calendarHeat(), we need to amend the underlying code of the function. ggplot2, on the other hand, offers more flexibility because we are building the visualization from the ground up.
Furthermore, ggplotly can be applied to the ggplot2 chart to make it interactive. For example, by using ggplotly we will be able to see the price for each day when hovering over a tile. calendarHeat(), on the other hand, cannot be integrated with ggplotly (I am not sure whether an existing package offers the same functionality).
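A minimal sketch of the ggplotly idea is shown here on a tiny made-up data frame rather than the AMZN data, so that it runs on its own. The names toy_tiles and p_toy and the values are hypothetical; the text aesthetic is used only to control what the tooltip shows.

```r
library(ggplot2)
library(plotly)

# Made-up values for two weeks of weekdays (illustration only)
toy_tiles <- data.frame(
  monthweek = rep(1:2, each = 5),
  weekday   = factor(rep(c("Mon", "Tue", "Wed", "Thu", "Fri"), times = 2),
                     levels = rev(c("Mon", "Tue", "Wed", "Thu", "Fri"))),
  price     = c(101, 103, 102, 105, 107, 106, 108, 110, 109, 111)
)

# Same tile layout as the calendar heatmap; ggplot2 may warn that geom_tile()
# ignores the text aesthetic, but ggplotly() still picks it up for the tooltip
p_toy <- ggplot(toy_tiles, aes(monthweek, weekday, fill = price,
                               text = paste("Price:", price))) +
  geom_tile(colour = "white")

# tooltip = "text" restricts the hover label to the custom text
interactive_plot <- ggplotly(p_toy, tooltip = "text")

# Display the widget only when running interactively (RStudio viewer or browser)
if (interactive()) print(interactive_plot)
```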
When it comes to convenience, calendarHeat() provides a much easier way to construct the chart. We just need to call a single function, and a relatively standard dataset can be visualized without much data munging.
When should Calendar Heat Maps be used?
Calendar heat maps are useful when daily values or day-of-the-week values are important. They are especially useful if we want to view daily values for a whole year.
On the other hand, if we want to see a trend (for example seasonality, shape, or stationarity), calendar heat maps are not very helpful. They also do not portray a monthly or yearly trend well. If we want to see an aggregate trend in the data, a simple line chart is a better way to go.
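For the aggregate-trend case, one possible approach is to collapse the daily values to monthly means and draw a line chart. The sketch below does this in base R on synthetic data; the data frame daily and its values are made up for illustration.

```r
# Synthetic daily series covering two full years (made-up values)
set.seed(7)
daily <- data.frame(
  date = seq(as.Date("2017-01-01"), as.Date("2018-12-31"), by = "day")
)
daily$value <- 100 + cumsum(rnorm(nrow(daily)))

# Group by "YYYY-MM" and take the mean of each month
month_key    <- format(daily$date, "%Y-%m")
monthly_mean <- tapply(daily$value, month_key, mean)

# Use the first day of each month as the x position
month_start <- as.Date(paste0(names(monthly_mean), "-01"))

plot(month_start, as.numeric(monthly_mean), type = "l",
     xlab = "Month", ylab = "Mean value")
```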
References
https://www.r-bloggers.com/ggplot2-time-series-heatmaps
https://www.rdocumentation.org/packages/iClick/versions/1.4/topics/calendarHeat
https://github.com/iascchen/VisHealth/blob/master/R/calendarHeat.R
https://www.tableau.com/about/blog/2017/2/viz-variety-show-heatmaps-66330