Build Shiny Dashboard with Elasticsearch

Bring data to life using Shiny Dashboard with Elasticsearch

Andy Kwan
Towards Data Science

Enterprises typically hold vast amounts of data spread across multiple sources. Building a dashboard to showcase business ideas therefore usually means integrating data from a NoSQL database, a relational database and a search engine. Dashboards such as Kibana or Google Data Studio are not well suited to this because their support for combining multiple data sources is limited, so one needs an alternative that is just as straightforward. For this scenario I recommend Shiny Dashboard, as it fulfills the requirements: it is flexible, straightforward and aesthetically pleasing. While building the dashboard, however, I immediately ran into an implementation barrier when connecting to Amazon Elasticsearch Service, and that motivated me to write this article.

This article demonstrates how to integrate Elasticsearch data into a Shiny dashboard. The programming language used is mainly R, while the back-end connection is made with Python. For the data visualization part, graphs are drawn with three different R graphing packages: ggplot2, plotly and wordcloud2.

The contents for the article:

  1. Elasticsearch connection (with Amazon Elasticsearch Service).
  2. Data manipulation using R.
  3. A showcase of Shiny Dashboard as a tool to bring data to life without much styling customization.

Environment used:

  • R version 3.4.4
  • Python 3.7.4
  • Ubuntu 18.04

Elasticsearch is a popular search engine in enterprise settings. Generally speaking, I would recommend adding it to the data infrastructure when you need to compute summary statistics or locate specific batches within a large amount of data in a timely manner. For setup, a convenient option is Amazon Elasticsearch Service, since one only needs to take care of high-level parameters such as the number of shards. Moreover, comprehensive documentation and well-documented sample code are provided, so there is little reason not to use it when a company has already built its infrastructure on AWS. However, the languages covered by Amazon's sample code do not include R. Although there are various R packages for connecting to Elasticsearch, integrating them with AWS Signature Version 4 authentication (AWS4Auth) is not straightforward. Overcoming this implementation barrier is the crucial part of building the dashboard.

Connect with Amazon Elasticsearch Service and Python Backend

First, we need to locate the Python path. I suggest two ways to search for it:

  1. Display it from within the Python environment:

import sys
# Full path of the Python interpreter that is currently running
print(sys.executable)

  2. Display it on the command line with the chosen Python environment activated:

which python

Both ways locate the path of your Python executable.

Once the Python path is known, we can set up the connection. We will use the reticulate package in R, which provides a comprehensive set of tools for interoperability between Python and R. Details can be found in the reticulate documentation, and I recommend reading the reticulate cheat sheet!
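
A minimal sketch of what the Python side of such a connection might look like, assuming the elasticsearch (7.x) and requests_aws4auth packages are installed in the chosen Python environment. The endpoint, region and credentials are placeholders, and client_kwargs and make_client are hypothetical helper names rather than the author's code:

```python
def client_kwargs(host, awsauth=None, connection_class=None):
    """Keyword arguments for elasticsearch.Elasticsearch (7.x-style options)."""
    kwargs = {
        "hosts": [{"host": host, "port": 443}],  # Amazon ES endpoints serve HTTPS on 443
        "use_ssl": True,
        "verify_certs": True,
    }
    if awsauth is not None:
        kwargs["http_auth"] = awsauth  # signs each request with AWS Signature Version 4
    if connection_class is not None:
        kwargs["connection_class"] = connection_class
    return kwargs


def make_client(host, region, access_key, secret_key):
    """Create a signed client; third-party imports are deferred until it is called."""
    from elasticsearch import Elasticsearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    awsauth = AWS4Auth(access_key, secret_key, region, "es")  # "es" is the service name
    return Elasticsearch(**client_kwargs(host, awsauth, RequestsHttpConnection))


# Placeholder endpoint, for illustration only
example = client_kwargs("my-domain.us-east-1.es.amazonaws.com")
print(example["hosts"])
```

From R, a file containing make_client could be loaded with reticulate::source_python(), after which the function can be called like any R function.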

This simple way works! It is always a good idea to learn both R and Python to avoid being stuck.

Demonstration

As we do not have a hosted Elasticsearch instance at the moment, let's create some sample data on the local machine for demonstration purposes.

Steps for setting up Elasticsearch with sample data on the local machine:

  1. Download Elasticsearch and follow its installation instructions.
  2. Download Kibana and follow its installation instructions.
  3. Start both Elasticsearch and Kibana.
  4. Load the sample flight data from Kibana, which is hosted at http://localhost:5601/.
Load the sample data from Kibana. (Image by author)

  5. Connect to Elasticsearch:

library(reticulate)
# Import the Python elasticsearch module and connect to the local instance
elasticsearch <- import("elasticsearch")
host <- "localhost:9200"
es <- elasticsearch$Elasticsearch(hosts = host)

Since AWS4Auth is not needed for a local instance, there are various ways to make the connection. You may also use an R-only approach.

  6. Install the packages needed for the Shiny dashboard, including shiny, shinyWidgets and shinydashboard.

Manipulate Data and Build Shiny Dashboard using R

Once connected, we are ready to start writing the dashboard's functionality. Let's start with the file structure:

├── global.R
├── server.R
└── ui.R

To better organize the code, I have separated it into the three files above.

global.R

As its name implies, this file stores the global variables and functions, and it is run once before the app starts. I have therefore put the Elasticsearch connection and queries here.

Connections and queries in global.R
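
As an illustration of the kind of query such a file might hold, here is a sketch of a single-aggregation request body written as a Python dict. The index name kibana_sample_data_flights and the Carrier field come from Kibana's sample flight data; the aggregation name flights_per_carrier and the helper terms_query are made up for this example:

```python
def terms_query(name, field, size=10):
    """Body for a single terms aggregation: one bucket per distinct value of `field`."""
    return {
        "size": 0,  # return only aggregation results, no individual documents
        "aggs": {
            name: {"terms": {"field": field, "size": size}}
        },
    }


body = terms_query("flights_per_carrier", "Carrier")
print(body)
# With a connected client `es`, the search could then be issued as:
# response = es.search(index="kibana_sample_data_flights", body=body)
```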

server.R

This file handles the back-end duties once the dashboard app has started. Logic such as how to render the plots and when to re-run queries belongs here.

The response from Elasticsearch.search is a Python dictionary, which reticulate converts to a named list in R. To make plotting easier, it is better to transform all the response data into R data frames, and there are two options:

  1. Transform the data in R.
  2. Transform the data in Python and convert the pandas.DataFrame to an R data frame.

For the demonstration, we will use R. Given the complexity of the named-list response, the most efficient way to write a suitable transformation function is to keep inspecting the structure of the object with str.

Example structure of the response subset data

After multiple rounds of trial and error in the R console, we come up with two functions that convert the data from single and double aggregation queries.
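
To illustrate the idea behind such conversion functions, here is a sketch of the same flattening logic in Python (the second option listed earlier). It relies only on the standard layout of terms aggregation responses, where each aggregation holds a list of buckets with key and doc_count. The function names, aggregation names and sample response are invented for this example and are not taken from the author's code:

```python
def single_agg_rows(response, agg):
    """One row per bucket of a single aggregation."""
    buckets = response["aggregations"][agg]["buckets"]
    return [{"key": b["key"], "count": b["doc_count"]} for b in buckets]


def double_agg_rows(response, outer, inner):
    """One row per (outer bucket, inner bucket) pair of a nested aggregation."""
    rows = []
    for ob in response["aggregations"][outer]["buckets"]:
        for ib in ob[inner]["buckets"]:
            rows.append({"outer": ob["key"], "inner": ib["key"], "count": ib["doc_count"]})
    return rows


# Hand-written sample response, shaped like an Elasticsearch reply
sample = {
    "aggregations": {
        "by_carrier": {
            "buckets": [
                {"key": "A", "doc_count": 3,
                 "by_delay": {"buckets": [{"key": "late", "doc_count": 1},
                                          {"key": "on_time", "doc_count": 2}]}},
                {"key": "B", "doc_count": 2,
                 "by_delay": {"buckets": [{"key": "on_time", "doc_count": 2}]}},
            ]
        }
    }
}

single = single_agg_rows(sample, "by_carrier")
double = double_agg_rows(sample, "by_carrier", "by_delay")
print(single)
print(double)
```

A list of row dicts like this can be passed to pandas.DataFrame, which reticulate converts to an R data frame.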

Now we prepare the data frames for plotting. First we create a data frame and plot a pie chart of the number of flights using Plotly for R.

Function to render a plotly figure

The transformed data frame has the form below.

Transformed dataframe for number of flights

Plotly is a popular tool for chart visualization that lets users create interactive visualizations. Figures made with Plotly support features such as zooming in and out and hover text. However, rendering a chart in Plotly can be quite slow compared with ggplot2 when displaying a vast amount of data.

Next we create a data frame and plot a stacked bar chart of the flight delay time series using ggplot2. We are going to add a time slider to the dashboard so that the user can select the time range of the chart, so we have to filter the data by the selected time inputs.
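
One way to apply the selected time range is inside the query itself. The sketch below assumes the sample data's timestamp and FlightDelayType fields and a daily histogram; calendar_interval is the parameter name in Elasticsearch 7.2 and later (earlier versions use interval). The function and aggregation names are hypothetical, and the author's code may instead filter the data frame in R:

```python
def delay_query(start, end):
    """Count flights per day and delay type between two ISO-8601 timestamps."""
    return {
        "size": 0,  # aggregation results only
        "query": {"range": {"timestamp": {"gte": start, "lte": end}}},
        "aggs": {
            "per_day": {
                "date_histogram": {"field": "timestamp", "calendar_interval": "day"},
                # nested terms aggregation: one sub-bucket per delay type within each day
                "aggs": {"delay_type": {"terms": {"field": "FlightDelayType"}}},
            }
        },
    }


body = delay_query("2020-01-01T00:00:00", "2020-01-07T23:59:59")
print(body["query"])
```

The response to such a double aggregation has one outer bucket per day, each containing inner buckets per delay type, which maps naturally onto a stacked bar chart.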

The transformed data frame has the form below.

Transformed data frame for flight delay

ggplot2 allows users to quickly visualize trends and add customization in layers. If a more interactive chart is desired, you can convert the figure to Plotly with ggplotly(p).

Finally, we create a data frame and a word cloud of the weather data using wordcloud2. We also add a slider to the dashboard so that the user can adjust the size of the word cloud.

The transformed data frame has the form below.

Transformed dataframe for weather

Word clouds made with wordcloud2 automatically calculate the sizes and positions of the words, and they display the raw values on mouse hover. Users can thus get an overview of the data as well as target specific words for insights.

ui.R

This file handles the front-end appearance of the app. The code for the Shiny widgets and styling should be placed here; in short, everything about how the dashboard looks lives in this file. Widgets are the essential elements of an interactive dashboard, and there are many online resources for creating them. Once you have an idea in mind, you can simply search for the right widget and place it in the ui.R file. Beyond that, the shinydashboard package provides further dashboard elements. Just make sure you handle the corresponding data logic each time you add a new widget.

As an example, for our word cloud box we create a sliderInput to control its overall size.

For further details of the UI, you are welcome to read my source code.

Finally, let's host the Shiny app from the R console:

runApp(host="0.0.0.0", port=1234)

Visit http://localhost:1234/ to view the demo dashboard.

The dashboard made in this demonstration. (Image by author)

Conclusion

In this article, we have walked through the essential steps to create a Shiny Dashboard that brings data to life. In particular, we have:

  1. Connected to Elasticsearch using a Python back end and converted the data to an R named list.
  2. Created multiple data transformation functions in R.
  3. Used the Plotly, ggplot2 and wordcloud2 packages for the graphs.

Thanks for reading! Hope you find this article useful. The source code is posted in my GitHub repository. Feel free to drop me a comment to tell me what you think about the article. Do tell me if you have a more convenient way for the task :)

LinkedIn:

https://www.linkedin.com/in/andy-k-1b1a44103/

Hong Kong | Data Scientist | Data Science and Machine Learning Enthusiast | Passionate in Solving Real World Problems with Data Science