
All You Need to Know to Secure Apps with CloudFront Functions And S3

Let's look at security best practices that helped us pass a security audit with flying colors


AG Carinae ("Celebrity Star" Nebula) by NASA, ESA and STScI

Currently, Amazon S3 and CloudFront are among the best cloud services for delivering production-ready SPAs, such as Angular, Vue, or React apps. Unfortunately, by default, S3 and CloudFront do not have all of the security features enabled that are needed to run such SPA frontends in production.

Recently, we launched a platform whose frontend (Angular apps) is hosted on Amazon S3 and CloudFront. In this article, I want to share the main security best practices and how we implemented them for our platform. Having these security best practices in place helped us pass a very stringent security audit with flying colors!

Overview of Security best practices with S3 and CloudFront

Both S3 and CloudFront are very mature cloud services and are relatively straightforward to start with. However, there is much more to them than first meets the eye, in particular when the services are used together. Combined with other cloud services such as Route 53 (the AWS DNS service) and AWS Certificate Manager, they become a powerful, must-have tool suite for any web developer. Unfortunately, their flexibility and seeming simplicity can be a security pitfall. Most of us have probably heard of at least one S3 data leak in the past few months (whenever you are reading this).

The figure below shows an example infrastructure for a typical SPA hosted on S3 and CloudFront¹. Naturally, such a frontend app depends on a set of backend services and APIs. As this article has already grown beyond what I initially expected, we will not discuss the platform's backend here.

Example deployment diagram of an Angular, Vue or React application with S3 and CloudFront (by the author).

In this article, I will mainly focus on the security best practices, which help address the following questions:

  • How to secure S3 buckets that store sensitive user data and the application code?
  • How to securely configure a CloudFront distribution?
  • How to protect frontend apps against common OWASP threats with CloudFront Functions?

So let’s first see what we can do to protect our Angular, Vue or React frontend apps with S3 and CloudFront.

Note: I decided to structure the article around the AWS services, as I want to make it as hands-on as possible.

Securing S3 buckets

Over the years, S3 has evolved into an extremely feature-rich cloud service. There are many use cases that can be implemented with Amazon S3, so securing S3 buckets and objects largely depends on how they are used. Probably the most common use case, and the one I want to examine in more detail, is hosting web apps and providing storage for users' binary object data, such as images, videos, and documents.

Securing an S3 bucket mainly requires: locking down access to the bucket, blocking all public access, securing data at rest, and securing data in transit.

Let's start with configuring bucket access permissions. First, we need to understand that there are several ways to control access to S3 buckets and objects: S3 bucket policies, S3 ACLs, S3 Access Point policies, or IAM policies. If you are interested in the distinctions, you should check out: "IAM Policies and Bucket Policies and ACLs! Oh, My!".

Access control with S3 bucket policies

I will focus on configuring access controls with S3 bucket policies, as in my opinion they are the most suitable for the task at hand. The following policy does a great job of securing an S3 bucket.
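A minimal sketch of such a policy is shown below; EXAMPLE_BUCKET, EXAMPLE_OAI_ID and BACKEND_ROLE_ID are placeholders that you would replace with your own bucket name, Origin Access Identity ID and Role Id.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontReadAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity EXAMPLE_OAI_ID"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::EXAMPLE_BUCKET/*"
    },
    {
      "Sid": "DenyWritesExceptBackendRole",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::EXAMPLE_BUCKET/*",
      "Condition": {
        "StringNotLike": {
          "aws:userId": ["BACKEND_ROLE_ID:*", "AIDA*"]
        }
      }
    }
  ]
}
```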

Our example S3 bucket policy contains two statements. The first statement locks down read access to the bucket's objects, i.e., it only allows accessing them via a specific CloudFront distribution. The distribution is specified by its Origin Access Identity, which can easily be created in the CloudFront console. This is very simple to set up, but it already improves security, because S3 bucket policies deny an action unless there is an explicit Allow². In our policy, the only explicit Allow is for the GetObject action, and it only allows our CloudFront distribution to read the bucket's objects. Note that this policy does not prevent accessing the data through the CloudFront distribution: anyone who knows the CloudFront URL will be able to read all the data in our S3 bucket. Later we will discuss how to further limit read access to our data.

Unpacking the second statement in our policy is a little more involved. It is used to block adding objects to the S3 bucket, unless the write request is made by a backend service that has assumed the role with the ID BACKEND_ROLE_ID. A more formal way of reading the statement would be: deny the PutObject action for any Principal, unless its userId starts with BACKEND_ROLE_ID. Or conversely: allow writing to this bucket only if the Principal has assumed the role with BACKEND_ROLE_ID.

It is important to notice that in our condition we are not using the role's ARN, but rather its unique Role Id. To retrieve the Role Id, we can run aws iam get-role --role-name ROLE_NAME. If for some reason you cannot use the AWS CLI, an alternative solution is to use the condition key aws:PrincipalArn instead of aws:userId, as it is also always included in the request. For example, you could change the above policy to specify the following condition statement, without changing the policy's semantics.
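A sketch of such a condition is shown below; ACCOUNT_ID and BACKEND_ROLE_NAME are placeholders, and the second entry plays the same role as the AIDA* prefix discussed next (it exempts IAM users from the Deny).

```json
"Condition": {
  "ArnNotLike": {
    "aws:PrincipalArn": [
      "arn:aws:iam::ACCOUNT_ID:role/BACKEND_ROLE_NAME",
      "arn:aws:iam::ACCOUNT_ID:user/*"
    ]
  }
}
```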

A keen-eyed reader will notice that we also have AIDA* specified in our statement's condition. By adding it to our condition expression, we also allow all IAM users to write objects to the S3 bucket. The reason is that AWS IAM assigns unique IDs to all users, and all of those IDs share the same prefix, which is, well, AIDA. While adding this condition is not desirable for production (remember to always keep people away from data), it can be quite useful during development. In addition, we can use a similar approach to set permissions for a CI/CD pipeline, Lambda functions, EC2 instances within an Auto Scaling group, and so forth³.
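If you want to look up these unique IDs yourself, the AWS CLI prints them directly (the role and user names below are placeholders):

```bash
# Unique Role Id (starts with AROA); this is the value used in the bucket policy condition
aws iam get-role --role-name BACKEND_ROLE_NAME --query 'Role.RoleId' --output text

# Unique IAM user Id (starts with AIDA)
aws iam get-user --user-name SOME_IAM_USER --query 'User.UserId' --output text
```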

Note: Activating IAM Access Analyzer can be very helpful in practice, both during policy creation (where it acts as a "grammar checker") and at runtime (where it continuously monitors changes to security policies). It is also free of charge.

This approach is a great way to allow backend services and external server-to-server callbacks to upload data to our S3 bucket. However, many applications will also want to let end users upload their own data, such as profile images. To enable uploading files directly from a frontend application to an S3 bucket, we need S3 presigned URLs. Roughly, the process works as follows: the backend service requests a presigned upload URL from S3 (via the AWS SDK), e.g., for a POST or PUT request, and S3 returns the signed URL. The backend service then sends this signed URL to the frontend client, which can upload directly to S3. Cost reduction and better performance (especially with S3 Transfer Acceleration) are just some of the benefits of using presigned URLs. Check out the following article for more details.

Uploading to Amazon S3 directly from a web or mobile application | Amazon Web Services
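To make this more concrete, here is a minimal sketch of the backend part using the AWS SDK for JavaScript v3; the region, bucket name and object key are placeholders.

```javascript
// Sketch: generate a presigned PUT URL that the frontend can use to upload
// a file directly to S3. Region, bucket and key below are placeholders.
const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");
const { getSignedUrl } = require("@aws-sdk/s3-request-presigner");

const s3 = new S3Client({ region: "eu-central-1" });

async function createUploadUrl(bucket, key) {
  const command = new PutObjectCommand({ Bucket: bucket, Key: key });
  // The returned URL expires after 15 minutes
  return getSignedUrl(s3, command, { expiresIn: 900 });
}

createUploadUrl("EXAMPLE_BUCKET", "uploads/profile-image.png")
  .then((url) => console.log(url));
```

The frontend then simply issues an HTTP PUT request with the file as the body against the returned URL.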

Securing data at rest with S3

Securing data at rest is a very broad topic, involving techniques such as data encryption, tokenization (anonymizing data), and masking (redacting data). S3 offers a number of useful features that can add an additional layer of security to the data residing in it. Although securing data at rest can be a very involved process, most web applications should be fine putting the following measures in place:

  • Enabling default bucket encryption⁴. Amazon S3 supports server-side encryption of user data, and it is fully transparent to the end user. It can also be enabled with one click in the S3 console. I recommend using Amazon S3-managed keys (SSE-S3), as that reduces costs and SSE-S3 is not subject to any rate limits.
  • Activating bucket versioning. This makes S3 store a new version of every modified or deleted object, from which we can restore (accidentally) compromised objects if necessary. Additionally, I find it very useful to enable MFA delete on the bucket, although this must be done with the root account. Activating versioning is very simple and, similar to enabling server-side encryption, it doesn't require us to change the application's business logic.
  • Enabling CloudTrail logging for S3. This is the bare minimum I would recommend. It logs S3 API calls, whether made from the console or from code (e.g., by the backend services).
  • Finally, as an additional security layer, make sure to block all public access to your S3 bucket (the CLI sketch below the figure shows how these settings can be applied).
Blocking all public access to an S3 bucket.
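For reference, a minimal sketch of applying these bucket settings with the AWS CLI looks as follows; EXAMPLE_BUCKET is a placeholder.

```bash
# Default bucket encryption with Amazon S3-managed keys (SSE-S3)
aws s3api put-bucket-encryption --bucket EXAMPLE_BUCKET \
  --server-side-encryption-configuration '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'

# Bucket versioning
aws s3api put-bucket-versioning --bucket EXAMPLE_BUCKET \
  --versioning-configuration Status=Enabled

# Block all public access
aws s3api put-public-access-block --bucket EXAMPLE_BUCKET \
  --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
```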

Securing CloudFront distributions

At its core, CloudFront is a Content Delivery Network (CDN) which (apart from its caching layer) does not store any data. Therefore, in the context of securing our frontend's CloudFront distribution, it makes sense to focus mainly on securing data in transit and managing access control.

Our CloudFront distribution is the only entry point to our application, which means that all user requests need to go through it. CloudFront also allows putting multiple origins behind a single distribution, so we can expose multiple S3 buckets through a single distribution, i.e., a single (sub-)domain. All this makes securing our CloudFront distribution a very important issue.

Access control with CloudFront’s signed URLs

By restricting access to our S3 bucket to CloudFront only, we have already significantly tightened access control. However, some data might require an additional layer of security. Enter CloudFront signed URLs.

Signed URLs are a very useful feature that probably deserves an article of its own. Here I briefly discuss when and how to use them, as our security best practices would be incomplete without them.

We usually decide to additionally protect S3 objects when we don't want them readable by everyone on the Web. This can be because the data is sensitive (e.g., users' purchase invoices) or because it should only be accessible to paying users (e.g., course videos). For example, private user files should be accessible only to that specific user and not to everyone on the Web. Below is an example of a signed URL, which can be used to access user_invoice.pdf.

https://example.com/user_invoice.pdf?Expires=1622924658&Signature=9MwQEvSlsWvNfv9GrW71WMiG4X...&Key-Pair-Id=APKAJXX2ABBXX6HIX

We notice that it is just a regular URL with three parameters appended at the end: Expires, which determines how long the URL is valid; the (hashed and signed) Signature itself; and the Key-Pair-Id of the public key used to generate the Signature. If any of the parameters is omitted or incorrect, CloudFront returns Forbidden with an HTTP 403 status code. Note also that in our example the signature is shortened for readability.

Generating a CloudFront signed URL and retrieving an S3 file with the signed URL (by the author).

Now, how do we generate such URLs? The most frequent usage pattern I have seen is to have a backend signing service, which acts as a trusted signer and signs URLs when requested by users (see the figure above). The URLs are typically stored "raw" (base URL, without signature), and the signing service generates signed URLs on demand as they are requested. These can then be embedded in frontend pages and shown to the user, e.g., on a user dashboard.
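A minimal sketch of such a signing service in Node.js, using the @aws-sdk/cloudfront-signer helper, is shown below; the key pair ID, the private key path and the URL are placeholders, and the private key must correspond to the public key registered with CloudFront.

```javascript
const fs = require("fs");
const { getSignedUrl } = require("@aws-sdk/cloudfront-signer");

// Private key matching the public key uploaded to CloudFront (placeholder path)
const privateKey = fs.readFileSync("./cloudfront_private_key.pem", "utf8");

function signUrl(rawUrl) {
  return getSignedUrl({
    url: rawUrl,
    keyPairId: "APKAEXAMPLEKEYID", // Key-Pair-Id of the public key (placeholder)
    privateKey,
    // The signed URL stays valid for one hour
    dateLessThan: new Date(Date.now() + 60 * 60 * 1000).toISOString(),
  });
}

console.log(signUrl("https://example.com/user_invoice.pdf"));
```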

Note: Recently, generating signed URLs became much more convenient, since the public keys used for signed URLs (see Key-Pair-Id in the example above) can now be managed through key groups by IAM users, without requiring the AWS root account. An alternative to signed URLs are signed cookies. Each has its purpose, but generally we can achieve similar effects with both signed URLs and signed cookies.

Securing data in transit with CloudFront and S3

CloudFront offers a number of configuration knobs that can easily be "turned" to improve the overall security of data in transit. Here we consider data in transit to be all data flows between the viewer (user) and the origin (our S3 buckets) that go through the CloudFront distribution (edge locations). The most important CloudFront config knobs include⁵:

  • Ensure that the security policy is configured with a secure TLS version and ciphers. This guarantees that CloudFront uses a secure version of the TLS protocol for HTTPS communication between CloudFront's edge locations and your users (viewers). As a rule of thumb, I recommend using TLSv1.0 or later (ideally only TLSv1.2_2019 if your browser compatibility allows it) and strictly avoiding the SSLv3 protocol altogether. Note: see the security policy setting of a CloudFront distribution.
  • Ensure that the communication between the CloudFront distribution and the viewer happens over HTTPS. Note: simply set the viewer protocol policy to always require viewers to use the HTTPS protocol⁴.
  • Ensure that the communication between CloudFront edge locations and their custom origins uses HTTPS, in order to fulfill compliance requirements for data-in-transit encryption. Note: this is enabled automatically by setting the viewer protocol policy as described above. A CDK sketch of these settings follows this list.
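For readers who manage their infrastructure with the AWS CDK (see endnote 1), a minimal sketch of these settings could look as follows; the bucket name, domain name and certificate ARN are placeholders.

```javascript
const cdk = require("aws-cdk-lib");
const s3 = require("aws-cdk-lib/aws-s3");
const acm = require("aws-cdk-lib/aws-certificatemanager");
const cloudfront = require("aws-cdk-lib/aws-cloudfront");
const origins = require("aws-cdk-lib/aws-cloudfront-origins");

class FrontendStack extends cdk.Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    // Existing frontend bucket and ACM certificate (placeholders)
    const siteBucket = s3.Bucket.fromBucketName(this, "SiteBucket", "EXAMPLE_BUCKET");
    const certificate = acm.Certificate.fromCertificateArn(
      this, "SiteCert", "arn:aws:acm:us-east-1:123456789012:certificate/EXAMPLE");

    new cloudfront.Distribution(this, "SiteDistribution", {
      defaultBehavior: {
        // S3Origin sets up an Origin Access Identity; the bucket policy from the
        // earlier section still has to grant it read access
        origin: new origins.S3Origin(siteBucket),
        // Force HTTPS between viewers and CloudFront
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
      },
      domainNames: ["app.example.com"],
      certificate,
      // Secure TLS security policy for viewer connections
      minimumProtocolVersion: cloudfront.SecurityPolicyProtocol.TLS_V1_2_2019,
    });
  }
}

module.exports = { FrontendStack };
```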

Finally, there is also a useful S3 feature for additionally securing data in transit: VPC endpoints, in particular gateway endpoints for S3. A gateway endpoint is like a gateway to AWS public-zone services. It is commonly used to enable an EC2 instance in a private subnet to access S3 (or DynamoDB) without leaving the private VPC network. Gateway endpoints work by adding new rules to routing tables, and access is controlled via VPC endpoint policies. However, I will not discuss VPC endpoints in more detail, as I believe that most applications work with data that doesn't necessarily require this level of isolation.


CloudFront Functions and protecting against common OWASP threats

Many of the top ten OWASP threats are not easy (or even possible) to address with S3 and CloudFront features alone. Until recently, we had to use Lambda@Edge⁶, but since a few weeks ago we can use CloudFront Functions to achieve the same as with a Lambda@Edge function, at a fraction of the cost. Below is the function we use to inject the most common security HTTP response headers and enforce some of the best security practices. We want this function to fire before a response is returned to the user; therefore, it should be associated with the CloudFront Functions event type viewer response.
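A minimal sketch of such a function is shown below; the concrete header values are examples and should be adapted to your application (the CSP in particular).

```javascript
// Viewer-response CloudFront Function that injects common security headers.
// Header values are examples, not a one-size-fits-all recommendation.
function handler(event) {
    var response = event.response;
    var headers = response.headers;

    headers['strict-transport-security'] = { value: 'max-age=63072000; includeSubDomains; preload' };
    headers['content-security-policy'] = { value: "default-src 'self'" };
    headers['x-xss-protection'] = { value: '1; mode=block' };
    headers['x-content-type-options'] = { value: 'nosniff' };
    headers['x-frame-options'] = { value: 'DENY' };
    headers['referrer-policy'] = { value: 'same-origin' };
    headers['expect-ct'] = { value: 'max-age=86400, enforce' };

    return response;
}
```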

Our CloudFront function injects several common HTTP security headers into the viewer responses returned by CloudFront:

  • HTTP Strict-Transport-Security (HSTS) is an HTTP response header which instructs the browser to always access the website using HTTPS. We add this header to protect our users from man-in-the-middle attacks.
  • Content Security Policy (CSP) is an HTTP response header, which tells the browser how and where it should load the page content. For example, CSP can be used to limit loading JS scripts only from trusted sources (such as own domain, Stripe and Google). CSP plays an important role in detecting and mitigating Cross Site Scripting (XSS) and data injection attacks.
  • X-XSS-Protection is an HTTP response header, which instructs a browser to block pages from loading when it detects reflected Cross Site Scripting (XSS) attacks. This is very useful for older browsers that don’t support CSP.
  • X-Content-Type-Options is an HTTP response header which instructs the browser to use MIME types as-is and disables the browser's content-sniffing heuristics. By adding this header to our responses, we can prevent MIME confusion attacks.
  • X-Frame-Options is an HTTP response header that indicates whether or not a browser is allowed to render a page in <frame>, <iframe>, <embed> or <object> elements. We can use this header to prevent clickjacking attacks.
  • Referrer-Policy is an HTTP response header, which controls how much referrer information should be included with requests to external links. Referrer policy is used to ensure that there is no Cross-domain Referrer leakage.
  • Expect-CT is an HTTP response header which enforces the Certificate Transparency policy, i.e., it requires that the certificate is present in public CT logs and that the response has a valid signature attached to it.

Note: This function was initially implemented with Lambda@Edge. That version of the function is available as a GitHub gist here.
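If you define the distribution with the CDK, as in the earlier sketch, attaching the function to the viewer response event could look roughly like this; the file path is a placeholder, and siteBucket comes from that earlier sketch.

```javascript
const cloudfront = require("aws-cdk-lib/aws-cloudfront");
const origins = require("aws-cdk-lib/aws-cloudfront-origins");

// Inside the stack from the earlier CDK sketch
const securityHeadersFn = new cloudfront.Function(this, "SecurityHeadersFn", {
  // Path to the function code shown above (placeholder)
  code: cloudfront.FunctionCode.fromFile({ filePath: "functions/security-headers.js" }),
});

new cloudfront.Distribution(this, "SiteDistribution", {
  defaultBehavior: {
    origin: new origins.S3Origin(siteBucket),
    viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
    // Run the function just before CloudFront returns the response to the viewer
    functionAssociations: [{
      function: securityHeadersFn,
      eventType: cloudfront.FunctionEventType.VIEWER_RESPONSE,
    }],
  },
});
```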


Closing thoughts

Amazon's Well-Architected Framework specifies 6 security best practices, which are broken down into 10 security questions. In this article, we mainly focused on implementing data protection best practices for frontend applications that use S3 and CloudFront for data storage and delivery. In particular, we addressed the SEC 9 (protecting data at rest) and SEC 10 (protecting data in transit) security questions. Additionally, we have seen how to protect against common OWASP threats by using CloudFront Functions (or Lambda@Edge functions).

My initial goal was to share my experience with securing frontend applications on AWS and some lessons learned from our last security audit. However, this turned out to be (what I think is) a comprehensive guide to securing frontend apps with S3 and CloudFront Functions.

Anyway, there are many tutorials explaining how to deploy SPAs such as Angular or React apps. However, the same is not true for a comprehensive, end-to-end guide that focuses on implementing security best practices.

Thanks for reading to the end! Hope you enjoyed this one!


Endnotes

  1. Here is an actual pattern for deploying such an application using the AWS Cloud Development Kit (CDK).
  2. Note that in AWS terminology the bucket owner is actually an AWS account, not the IAM user which created the bucket or object. Please also see endnote 3 below.
  3. Note that all services (and users) which want to access an S3 bucket need an explicit permission to do so. The permission can be granted via an S3 bucket policy or via an IAM policy; I prefer using the latter. Therefore, in this article I assume that the BACKEND_ROLE has an associated IAM policy which allows it to talk to S3.
  4. This can incur additional costs.
  5. In addition to these points, we can use field-level encryption for particularly sensitive data, such as credit card information. It is an additional layer of encryption on top of HTTPS, and it guarantees that only your application can decrypt those fields. For example, this can prevent information leakage through system logs or diagnostic tools.
  6. AWS does offer additional services such as WAF and Shield, but one should not rely solely on them. Rather, we should always strive for a defense-in-depth approach, i.e., securing all layers of our application.
