
Deploying Dashboards for Machine Learning with AWS

Sample AWS CloudFormation template & code included.

We all strive to develop useful machine learning models. After spending significant effort on data preparation and model development, we want our models to positively impact business and the wider world. Although success depends on model performance, it’s just as important to communicate model predictions in a clear and effective way. Most machine learning applications can benefit from having a dashboard interface.

What are the benefits of having a dashboard?

A dashboard is a graphical user interface that shows information related to a particular business objective or process, and this can include machine learning model predictions too. Consuming visual content is typically much easier for end-users and dashboards are visual by definition. Unlike with tabular data, the eye can quickly pick out trends, outliers and other patterns in the model predictions. Some machine learning tasks are visual by nature (e.g. object detection) while others can be shown with charts (e.g. time-series forecasting) and maps (e.g. spatial–temporal forecasting). Compelling dashboard visuals can even be created for simple classification tasks, just by aggregating predictions in different ways. Given a well-designed dashboard, an end-user can make a more informed and faster decision.

Giving users the ability to interact with models and their predictions, via a dashboard, often increases trust in the model and leads to greater adoption of the model. Added to this, dashboards are an extremely common usage pattern within businesses today, and this familiarity encourages further adoption. And even if dashboards are not used in the final product, they are an invaluable tool for collecting feedback early on in the development cycle.

Figure 1: an example dashboard deployed with this solution.

What dashboard services or tools should be used?

With the popularity of dashboards, it’s unsurprising there are a large number of services and tools to choose from. Selecting the right one for the job will depend on your specific requirements, but there are two broad categories.

  1. Managed dashboard services: such as Amazon QuickSight and Kibana.
  2. Custom dashboard tools: such as Streamlit, Panel and Dash.

As a general rule of thumb, you should choose a managed dashboard service if you need database integrations, user management and scalability AND the visualisation options are sufficient for your particular use case. When you need an extra level of customization (over the visuals and user interface), you should choose a custom dashboard tool instead.

Our examples use Streamlit because of its simplicity, flexibility and integrations with a wide variety of visualisation tools such as Altair, Bokeh and Plotly. As always though, there are tradeoffs. Custom dashboard tools can be more complex to deploy than their managed service counterparts, because you need to handle database integrations, user management and scalability.
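
To give a sense of why Streamlit suits this kind of work, here is a minimal, self-contained sketch of a Streamlit dashboard script. It is not taken from the example applications; the file name (app.py) and the synthetic data are purely illustrative.

# app.py -- a minimal Streamlit dashboard sketch (illustrative only)
import numpy as np
import pandas as pd
import streamlit as st

st.title('Demand Forecast')

# Let the user pick a forecast horizon interactively.
horizon = st.slider('Forecast horizon (days)', min_value=7, max_value=90, value=30)

# Synthetic data standing in for real model predictions.
forecast = pd.DataFrame({
    'predicted_demand': np.random.rand(horizon).cumsum()
})

# Streamlit re-runs the script on every interaction and renders the chart.
st.line_chart(forecast)

Running streamlit run app.py serves the dashboard on port 8501 by default, which is the port used throughout the deployment described below.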

What’s the solution for deployment?

We’ll walk through a comprehensive answer to that question in this post, and dive into important details that might otherwise be overlooked. We’ll first consider deploying our custom dashboard on Amazon EC2. Adding some additional requirements (such as security, authentication and scalability), we then discuss a more comprehensive architecture for dashboard deployment.

Going beyond just the theory, we’ve also included a customisable solution for dashboard development and deployment. An AWS CloudFormation template is shared so you can create all of the required resources inside your own AWS account with just a few clicks. You can choose to deploy one of two example dashboard applications: ‘Uber Pickups in New York City’ (a self-contained example) and ‘DistilGPT-2 Text Generation’ (an example that interacts with a machine learning model). All code is customisable. We’ve taken a containerised approach (with Docker), so you can use this solution with a range of custom dashboard tools.

🐙 : Click here to see code on GitHub

🚀 : Click here to launch AWS CloudFormation stack
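
If you’d rather launch the stack programmatically than through the console link above, the same template can be deployed with the AWS SDK. The sketch below uses boto3; the stack name, template URL and parameter key are placeholders, so use the values documented in the GitHub repository.

import boto3

cloudformation = boto3.client('cloudformation')

# Placeholders: the real template URL and parameter names are documented
# in the GitHub repository linked above.
cloudformation.create_stack(
    StackName='ml-dashboard',
    TemplateURL='https://{TEMPLATE_BUCKET}.s3.amazonaws.com/deployment.yaml',
    Capabilities=['CAPABILITY_IAM'],  # the stack creates IAM roles
    Parameters=[
        {
            'ParameterKey': 'CognitoAuthenticationSampleEmailAddress',  # placeholder key
            'ParameterValue': 'you@example.com'
        }
    ]
)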

Minimal Approach

One of the simplest-sounding approaches to deploying Streamlit on AWS is to use Amazon EC2. You can deploy your dashboard on an Amazon EC2 instance (i.e. a virtual server in the AWS Cloud) and let dashboard users connect to that instance directly. If your dashboard depends on a deep learning model, you can use an AWS Deep Learning AMI on a GPU instance. When deploying on Amazon EC2, though, you should ask yourself the following questions:

  - Who can access the application, and how can access be limited to certain individuals?
  - Are sensitive communications encrypted with HTTPS?
  - What happens if the server crashes?
  - Who’s going to install bug fixes and security updates on the instance?
  - What happens if the number of users goes up and down significantly over time? Will the instance handle peak traffic?
  - What about updates to the model and application?

And the list goes on.

So although this approach is architecturally simple, a number of factors can make it more complex in practice, depending on your use case. We’ll now look at an alternative approach that combines several other AWS services to deliver a fully featured deployment.

Comprehensive Approach

We use three central AWS services in this approach: Amazon SageMaker, Amazon ECS and Amazon Cognito. Amazon SageMaker is designed for seamless model training and deployment, and it works great for dashboard development too. Amazon ECR and Amazon ECS are a perfect complement for containerised deployments. And Amazon Cognito specialises in simple and secure authentication. Combining these services, we end up with the architecture shown in Figure 2. We’ll now dive into the details.

Figure 2: Architecture of AWS components used. Some of which are optional.

Using Amazon SageMaker

When it comes to building, training and deploying machine learning models, Amazon SageMaker simplifies the experience. Within minutes you can spin up a Jupyter notebook and start deploying models on dedicated, fully managed infrastructure. Out of the box, you have access to a number of pre-built Conda environments and Docker containers. In the ‘DistilGPT-2 Text Generation’ example, the pre-built PyTorch Docker container is used to deploy the DistilGPT-2 model from transformers on an ml.c5.xlarge instance. Amazon SageMaker then provides a simple HTTP endpoint for interacting with the deployed model. Our example application uses the [invoke_endpoint](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker-runtime.html?highlight=invoke_endpoint#SageMakerRuntime.Client.invoke_endpoint) method (from boto3) to call the text generation model:

import boto3
import json

# Request body: the text prompt plus generation parameters for the model.
data = {
    'text': 'In a shocking finding',
    'parameters': {
        'min_length': 100,
        'max_length': 200
    }
}

# Call the SageMaker endpoint hosting the text generation model.
sagemaker_client = boto3.client('sagemaker-runtime')
response = sagemaker_client.invoke_endpoint(
    EndpointName='text-generation',
    ContentType="application/json",
    Accept="application/json",
    Body=json.dumps(data)
)

# The response body is a JSON payload containing the generated text.
body_str = response['Body'].read().decode("utf-8")
body = json.loads(body_str)
print(body['text'])
# In a shocking finding, scientist discovers a herd of unicorns...
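
For reference, the ‘text-generation’ endpoint invoked above is created beforehand with the SageMaker Python SDK and the pre-built PyTorch container. The sketch below shows roughly what that looks like; the entry point script, model artefact location, execution role and framework version are placeholders, so refer to the example notebook for the exact values.

from sagemaker.pytorch import PyTorchModel

# Placeholders: the real entry point, model artefact and framework version
# are defined in the example notebook.
model = PyTorchModel(
    model_data='s3://{BUCKET}/model.tar.gz',
    role='{SAGEMAKER_EXECUTION_ROLE_ARN}',
    entry_point='inference.py',
    framework_version='1.5.0',
    py_version='py3'
)

# Deploy to a real-time endpoint backed by an ml.c5.xlarge instance.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.c5.xlarge',
    endpoint_name='text-generation'
)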

Amazon SageMaker can be used for dashboard development too. After deploying the model, the dashboard can be built and tested directly on the notebook instance. A containerised approach is taken for simplified application deployment, but you still get all the benefits of live-reload when editing files (thanks to a local volume mount). When running the dashboard container on the Amazon SageMaker Notebook Instance, you can access it via the jupyter-server-proxy at the following authenticated URL:

https://{NOTEBOOK_URL}/proxy/8501/

When you’re finished with application development on Amazon SageMaker, you can push your container to Amazon Elastic Container Registry (ECR). Similar to Docker Hub, it provides a repository for your Docker images, but it keeps the images within your AWS account for extra security and reliability.

docker tag {IMAGE_NAME} {DASHBOARD_ECR_REPOSITORY_URL}:latest
docker push {DASHBOARD_ECR_REPOSITORY_URL}:latest

Using Amazon ECS

Your dashboard Docker image is now on Amazon ECR, but the application isn’t actually running yet. Amazon Elastic Container Service (ECS) is a fully-managed service for running Docker containers. You don’t need to provision or manage servers, you just define the task that needs to be run and specify the resources the task needs. Our [example](https://github.com/awslabs/sagemaker-dashboards-for-ml/blob/c8a8ec98f1d31a35b90aee666526b437a5d78410/cloudformation/deployment/deployment.yaml#L403) task definition states that the dashboard Docker container should be run with a single vCPU and 2GB of memory. Our example service runs and maintains a specified number of instances of the task definition simultaneously. So for increased availability, you can set the desired task count of your service to 2 using the AWS CLI:

aws ecs update-service \
  --cluster {DASHBOARD_ECS_CLUSTER} \
  --service {DASHBOARD_ECS_SERVICE} \
  --desired-count 2

One of the main advantages of using an Amazon ECS service is that it constantly monitors the health of its tasks and replaces any that fail or stop for whatever reason. Amazon ECS services can also auto-scale the number of tasks (i.e. automatically increase or decrease it) to deal with high demand at peak times and reduce cost during periods of low utilisation. Our example solution also includes an Application Load Balancer, which distributes traffic across tasks, integrates with AWS Certificate Manager (for HTTPS) and authenticates traffic with Amazon Cognito.
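
As an illustration of how auto-scaling can be configured, the snippet below registers the ECS service with Application Auto Scaling and attaches a target-tracking policy on average CPU utilisation. It is a hedged sketch rather than part of the example solution, and the cluster and service names are placeholders.

import boto3

autoscaling = boto3.client('application-autoscaling')

# Placeholders for the ECS cluster and service created by the stack.
resource_id = 'service/{DASHBOARD_ECS_CLUSTER}/{DASHBOARD_ECS_SERVICE}'

# Allow the service to scale between 1 and 4 dashboard tasks.
autoscaling.register_scalable_target(
    ServiceNamespace='ecs',
    ResourceId=resource_id,
    ScalableDimension='ecs:service:DesiredCount',
    MinCapacity=1,
    MaxCapacity=4
)

# Add tasks when average CPU utilisation rises above 75%, remove them when it falls.
autoscaling.put_scaling_policy(
    PolicyName='dashboard-cpu-target-tracking',
    ServiceNamespace='ecs',
    ResourceId=resource_id,
    ScalableDimension='ecs:service:DesiredCount',
    PolicyType='TargetTrackingScaling',
    TargetTrackingScalingPolicyConfiguration={
        'TargetValue': 75.0,
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'ECSServiceAverageCPUUtilization'
        }
    }
)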

Using Amazon Cognito

When the contents of your dashboard are private, Amazon Cognito can be used to restrict access to a certain set of users. Although this component is optional, it’s enabled by default in the solution. You can integrate with social and enterprise identity providers (such as Google, Facebook, Amazon and Microsoft Active Directory), but the solution creates its own user pool with application-specific accounts. You just need to provide an email address during stack creation to receive the temporary login credentials.
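
If you later want to grant access to additional users, they can be added to the same user pool. A hedged boto3 sketch is shown below; the user pool ID is a placeholder taken from the outputs of the deployed stack, and the email address is purely illustrative.

import boto3

cognito = boto3.client('cognito-idp')

# Placeholder: the user pool ID is an output of the deployed stack.
cognito.admin_create_user(
    UserPoolId='{DASHBOARD_USER_POOL_ID}',
    Username='colleague@example.com',
    UserAttributes=[
        {'Name': 'email', 'Value': 'colleague@example.com'},
        {'Name': 'email_verified', 'Value': 'true'}
    ],
    DesiredDeliveryMediums=['EMAIL']  # send the temporary password by email
)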

Figure 3: Amazon Cognito Sign-in

When Amazon Cognito authentication is enabled, you’ll see the managed sign-in page the first time you try to access the application. Sign in with the temporary credentials, then set a new password for the account. Once you’ve logged in successfully, you’ll be able to see your dashboard.

Summary

Sometimes dashboards for machine learning can be showcased directly from your development machine. But for the times when you need to share your dashboards with the rest of the world (or within your company), a robust and secure approach is required.

We walked through a solution that can help with this: Amazon SageMaker was used to simplify the machine learning model deployment, Amazon ECR and ECS were used to run and maintain the dashboard servers and Amazon Cognito was used to control dashboard access. AWS CloudFormation can be used to automatically create all the AWS resources for the solution in your own AWS account, and you can then customise the solution as required.

🐙 : Click here to see code on GitHub

🚀 : Click here to launch AWS CloudFormation stack

