Introducing Querypal: Web and Mobile UI for Amazon Athena

Olalekan Elesin
Towards Data Science
6 min readDec 24, 2020

--

Querypal: Web UI for Amazon Athena

Last week before Christmas holidays at work, we enabled an internal team to start using the data lake for ad-hoc analyses via Amazon Athena. During the introductory session, we demonstrated how to connect with Amazon Athena via JDBC with DBeaver SQL client on Mac. This was a smooth process until a member of the team decided to give this a try with their Windows machine. After so many tries, my colleague was unable to establish a successful connection due a DBeaver certificate issue blocked by our internal VPN.

Even though my colleague was very appreciative the effort we put into giving them access to data anytime they needed it, we felt unaccomplished as they were not able to access the data they wanted, when they wanted it. We assessed alternative approaches, example granting them access to our AWS team account, and limiting the permissions via roles to use Amazon Athena only. This was too much of work considering the “job” our colleagues wanted to get done. Also, the cognitive work required to navigate the AWS Console for non-technical users, just to write simple SQL statements was too much.

Faced with these challenges, I mapped out the activities needed to onboard a new colleague to the SQL interface of our data lake, and asked myself: “What could I do to simplify this experience?” Before I go into the solution, permit me to briefly outline the activities:

  1. Create an IAM User
  2. Generate and download static credentials
  3. Attach data lake (Amazon Athena and AWS Glue) access policies to created IAM user
  4. Send credentials with colleague
  5. Install DBeaver SQL client on work station
  6. Download Amazon Athena JDBC driver and setup Amazon Athena connection
  7. If VPN issues, reach out to IT infrastructure team to resolve.
  8. Document in internal wiki
  9. Repeat steps 1 to 7 for many users or should there be credential loss.

It is quite clear that the steps involved would not scale, considering our team’s vision of a usable data platform.

In the remaining part of this blogpost, I will show you how I rapidly developed a serverless web UI for Amazon Athena using AWS Amplify and VueJS to query data in S3 without any of the steps above.

What is Querypal?

Querypal, inspired by Airbnb’s Airpal, is a web-based query execution tool built on top of Amazon Athena to facilitate data analysis. Querypal is built with AWS Amplify and VueJS, meaning no servers to maintain and only pay for what is used. Like Airpal, you can highlight syntax, view query history, explore databases and use Table Explorer to visualize schema of a table. Think of Amazon Athena query editor for logging into the AWS Console.

Querypal: Web UI for Amazon Athena

Use AWS Amplify to deploy Querypal

Deploying Querypal to your Amazon Web Services account is as simple as a click on the “Deploy to Amplify Console” on the GitHub project repository. All the required AWS IAM permissions have been configured.

Currently, Querypal does not support database or table creation. However, this can be easily configured with a 2 to 4 lines of IAM permissions. Once deployed to AWS Amplify, you are prompted to access the URL which shows you an authentication page managed by Amazon Cognito.

Querypal authentication by Amazon Cognito

What’s AWSome about AWS Amplify and Amazon Cognito is that you can configure Amazon Cognito to use your organization’s SAML authentication. Once the user is successfully logged in, they are ready to start exploring data with SQL.

Log in to the AWS Athena Console and create tables

For this blog, we will use the Wikimedia’s page count data, which is publicly available at ‘’. This data is available in textfile and Parquet format but we will make use of the Parquet data. Because Querypal does not have permissions to create table, I will go ahead to create my table via the Amazon Athena web console.

Querypal: user not allowed to create table
CREATE EXTERNAL TABLE wikistats_parq (
language STRING,
page_title STRING,
hits BIGINT,
retrived_size BIGINT
)
STORED AS PARQUET
LOCATION 's3://emr.presto.airpal/wikistats/parquet';

Try Amazon Athena in Querypal

Open a browser and go to ‘https://master.<amplify-app-id>.amplifyapp.com’

Querypal: Table and Schema Explorer

We will now use Amazon Athena queries to answer the questions below. Paste the queries into Querypal editor. These questions were taken from Analyze Data with Presto and Airpal on Amazon EMR blogpost on AWS Big Data Blog.

What is the most frequently viewed page with page_title that contains “Amazon”?

SELECT language, 
page_title,
SUM(hits) AS hits
FROM default.wikistats_parq
WHERE language = 'en'
AND page_title LIKE '%Amazon%'
GROUP BY language,
page_title
ORDER BY hits DESC
LIMIT 10;

On average, what page is hit most in English?

SELECT language, page_title, AVG(hits) AS avg_hits
FROM default.wikistats_parq
WHERE language = 'en'
AND page_title NOT IN ('Main_Page', '404_error/')
AND page_title NOT LIKE '%Special%'
AND page_title NOT LIKE '%index%'
AND page_title NOT LIKE '%Search%'
AND NOT regexp_like(page_title, '%20')
GROUP BY language, page_title
ORDER BY avg_hits DESC
LIMIT 10;

Using Amazon Athena Query Editor

We can also use the Amazon Athena Query Editor to execute the last query:

Amazon Athena query editor

Summary

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. In this post, I’ve shown you how easy it is to set up Querypal to run run interactive queries on Amazon Athena. With Querypal, you no longer have to share static AWS credentials, download JDBC drivers and so on. You simply provision and your data lake users explore SQL at their convenience.

Querypal is also mobile friendly, helping you put more data in to more hands. Users can write SQL and explore tables in the data lake from their mobile devices on the go.

Querypal: Login and query editor interface on mobile

Querypal is now open source, but private beta is available to test out new features.

Thanks for reading and I’d love to get your feedback. You can reach me via email, follow me on Twitter or connect with me on LinkedIn.

--

--

Enterprise technologist with experience across technical leadership, architecture, cloud, machine learning, big-data and other cool stuff.