When you log in to https://app.sendgrid.com to check out your account details or visit https://mc.sendgrid.com for Marketing Campaigns, you are visiting our frontend web applications hosted in AWS S3 buckets with CloudFront distributions on top of them.
This AWS S3 and CloudFront infrastructure works well for hosting our web applications’ files over a content delivery network at scale, but our initial configurations lacked tighter protections in the form of security headers.
Adding these security headers would protect users from attacks such as cross-site scripting, MIME sniffing, clickjacking, code injection, and man-in-the-middle attacks related to insecure protocols. Left unaddressed, these vulnerabilities could have grave consequences for our customers’ data and for the trust placed in our company to deliver a secure experience on the web.
Before we researched how to add these headers, we first took a step back to see where we were. After running our web application URL through a security headers scan website, we unsurprisingly received a failing grade, but the report gave us a helpful list of headers to look into.
There was much room for improvement. We researched how to configure our AWS S3 and CloudFront resources to respond with security headers to mitigate the risks and vulnerabilities mentioned.
At a high level, we can accomplish this by creating a Lambda@Edge function that alters the origin response headers to append the desired security headers before the web app’s files return to the user’s browser.
The strategy is to first test out hooking things up manually through the AWS Console. Then, we will put these configurations in Terraform to save this part of the infrastructure in code for future reference and shareability across other teams and applications.
What kind of security headers would we like to add?
As part of the recommendations from our Product Security team, we were tasked with adding security headers such as “Strict-Transport-Security” and “X-Frame-Options.” We recommend you also check out resources like the MDN Web Security Cheatsheet to get up to speed. Here is a short summary of the security headers that you can apply to your web applications.
Strict-Transport-Security
This hints to the browser that it should access your web application through HTTPS rather than HTTP.
Content-Security-Policy
This sets explicit allow lists for the kinds of resources you load or connect to in your web application, such as scripts, images, styles, fonts, network requests, and iframes. This was the toughest one for us to set up as we had third-party scripts, images, styles, and API endpoints to explicitly record in these policies.
Tip: Use the Content-Security-Policy-Report-Only header to help you with testing in certain environments. If certain resources violated the policies, we observed helpful console output of the resources we needed to allow in our policies.
If you would like to avoid blank screens and web apps failing to load, we strongly recommend that you experiment with your policies in report-only mode first and test thoroughly before deploying these security policies in production.
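To make the report-only switch concrete, here is a minimal sketch of building a policy string from allow lists and choosing the header name with a flag. The `buildPolicy` and `cspHeader` helpers and the example.com domains are hypothetical names used only for illustration.

```javascript
// Hypothetical allow lists; replace the example domains with your own sources.
const directives = {
  "default-src": ["'self'"],
  "script-src": ["'self'", "https://scripts.example.com"],
  "style-src": ["'self'", "'unsafe-inline'"],
  "connect-src": ["'self'", "https://api.example.com"],
};

// Join each directive and its sources into a single policy string.
function buildPolicy(policyDirectives) {
  return Object.entries(policyDirectives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Pick the header name based on whether the policy should only report
// violations or actually enforce them.
function cspHeader(policyDirectives, reportOnly) {
  return {
    name: reportOnly
      ? "Content-Security-Policy-Report-Only"
      : "Content-Security-Policy",
    value: buildPolicy(policyDirectives),
  };
}

console.log(cspHeader(directives, true));
```

Keeping the flag separate from the lists means that moving from report-only to enforcement is a one-line change once the console stops reporting violations.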
X-Content-Type-Options
This tells the browser not to MIME sniff, so assets in your web page load only with their declared MIME types.
X-Frame-Options
This sets rules for whether and how your web application can be loaded in an iframe.
X-XSS-Protection
This stops pages from loading if a cross-site scripting attack is detected in certain browsers.
Referrer-Policy
This manages how the “Referer” header, which carries information about the origin of the request, is passed along when following links to external sites or resources.
With these security headers in mind, let’s get back to how our CloudFront distributions are set up today and how Lambda@Edge functions will help us achieve our goal.
Using Lambda@Edge with our CloudFront distributions
For our CloudFront distributions, we configure things such as:
- The SSL certificates for the domains to attach on top of the CloudFront URL like https://app.sendgrid.com
- S3 bucket origins
- Origin groups with primary and replica buckets for automatic failover
- Cache behaviors
These cache behaviors, in particular, not only allow us to control how long the responses for certain types of paths and files are cached in the edge servers around the world, but also provide a way to trigger AWS Lambda functions in response to events such as origin requests and origin responses. You can think of an AWS Lambda function as code you define that runs in response to a certain event.
We modeled our approach after this AWS blog post and extended it to make changes to a specific Content Security Policy easier. You can model your Lambda@Edge function after the way we set things up by generating lists for the script, style, and connect sources. The function modifies the CloudFront origin response headers, appending each security header with its value to the response, and then returns the response by calling the provided callback function.
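As a rough illustration of this approach, the sketch below appends security headers to an origin response using the callback style described above. The header values, the source lists, and the `CSP_HEADER_NAME` constant are assumptions chosen for the example, so adjust them to your own application.

```javascript
// Static headers with commonly used example values (assumptions for this sketch).
const STATIC_HEADERS = {
  "Strict-Transport-Security": "max-age=63072000; includeSubDomains; preload",
  "X-Content-Type-Options": "nosniff",
  "X-Frame-Options": "DENY",
  "X-XSS-Protection": "1; mode=block",
  "Referrer-Policy": "same-origin",
};

// Hypothetical source lists; add the third-party domains your app really uses.
const scriptSources = ["'self'", "https://scripts.example.com"];
const styleSources = ["'self'"];
const connectSources = ["'self'", "https://api.example.com"];

const policy = [
  "default-src 'self'",
  `script-src ${scriptSources.join(" ")}`,
  `style-src ${styleSources.join(" ")}`,
  `connect-src ${connectSources.join(" ")}`,
].join("; ");

// Start in report-only mode; switch to "Content-Security-Policy" to enforce.
const CSP_HEADER_NAME = "Content-Security-Policy-Report-Only";

const handler = (event, context, callback) => {
  // The origin response arrives inside the first CloudFront record.
  const response = event.Records[0].cf.response;
  const allHeaders = { ...STATIC_HEADERS, [CSP_HEADER_NAME]: policy };

  for (const [name, value] of Object.entries(allHeaders)) {
    // CloudFront keys headers by lowercase name, each holding a list of
    // { key, value } entries.
    response.headers[name.toLowerCase()] = [{ key: name, value }];
  }

  // Hand the modified response back to CloudFront.
  callback(null, response);
};

exports.handler = handler;
```

Keeping the sources in separate lists makes a policy change a small diff: you add one domain to one array rather than editing a long policy string.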
How do we test this Lambda@Edge function?
Before you officially change how your assets are returned with security headers, you should verify the function works after you configure everything manually through the AWS console. It is crucial that your web applications still load and function properly with the security headers added to your network responses. The last thing you want is an unexpected outage caused by the security headers, so test thoroughly in your development environments.
It is also important to know what exactly you will be writing in Terraform code later to save this configuration in your codebase. In case you do not know about Terraform, it provides you a way to write and manage your cloud infrastructure through code.
Tip: Take a look at the Terraform docs to see if it can help you maintain your complex configurations without needing to remember all the steps you did in the cloud consoles.
How to get started in the AWS Console
Let’s get started with how to set things up manually through the AWS Console.
1. First, you need to create the Lambda@Edge function in the “us-east-1” region. Going to the Lambda services page, we will click “Create Function” and name it something like “testSecurityHeaders1.”
2. You may use an existing role with permissions to run the function at edge servers or you can use one of their role policy templates such as “Basic Lambda@Edge Permissions…” and name it “lambdaedgeroletest”.
3. After creating your test lambda function and role, you should see something like this where you’ll notice the “Add Trigger” button for the function. This is where you will eventually associate the lambda with a CloudFront distribution’s cache behavior to be triggered on the origin response event.
4. Next, you need to edit the function code with the security headers code we crafted before and hit “Save.”
5. After saving the function code, let’s test out if your lambda function even works by scrolling to the top and hitting the “Test” button. You will create a test event named “samplecloudfrontresponse” using the “cloudfront-modify-response-header” event template to mock an actual CloudFront origin response event and to see how your function runs against it.
You’ll notice things like the “cf.response” headers object which your lambda function code will modify.
6. After creating the test event, you will click the “Test” button again and should see how the lambda function ran against it. It should run successfully, with logs displaying the resulting response and its added security headers.
Great, the lambda function looks like it appended the security headers to the response correctly!
7. Let’s go back up to the “Designer” area and click the “Add Trigger” button so you can associate the lambda function with your CloudFront distribution’s cache behaviors on the origin response event. Make sure to select a “CloudFront” trigger and click the “Deploy to Lambda@Edge” button.
8. Next, you select the CloudFront distribution (in our example, we cleared the input here for security reasons) and a cache behavior to associate with it.
You then choose the “*” cache behavior and select the “Origin response” event in order to match on all request paths to your CloudFront distribution and to make sure the lambda function always runs for all origin responses.
You then check off the acknowledgement before clicking “Deploy” to officially deploy your lambda function.
9. After successfully associating your lambda function with all of your relevant CloudFront distribution’s cache behaviors, you should see something similar to this in the lambda dashboard “Designer” area where you can see the CloudFront triggers and have the option of viewing or deleting them.
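Checking the logged response by eye in step 6 works, but the same check can be scripted. The sketch below is one possible way to do it, assuming the response has the `cf.response` shape from the test event; the `missingSecurityHeaders` helper, the `EXPECTED_HEADERS` list, and the sample response are hypothetical.

```javascript
// Header names to look for, in the lowercase form CloudFront uses as keys.
const EXPECTED_HEADERS = [
  "strict-transport-security",
  "content-security-policy",
  "x-content-type-options",
  "x-frame-options",
  "x-xss-protection",
  "referrer-policy",
];

// Return the expected header names that are absent from a cf.response object.
// A report-only policy counts as satisfying the Content-Security-Policy entry.
function missingSecurityHeaders(response) {
  const present = new Set(
    Object.keys(response.headers || {}).map((key) => key.toLowerCase())
  );
  return EXPECTED_HEADERS.filter((name) => {
    if (name === "content-security-policy") {
      return !present.has(name) && !present.has(`${name}-report-only`);
    }
    return !present.has(name);
  });
}

// Hypothetical response, shaped like the result logged by the Lambda test run.
const sampleResponse = {
  status: "200",
  headers: {
    "x-frame-options": [{ key: "X-Frame-Options", value: "DENY" }],
    "content-security-policy-report-only": [
      { key: "Content-Security-Policy-Report-Only", value: "default-src 'self'" },
    ],
  },
};

console.log(missingSecurityHeaders(sampleResponse));
```

An empty array means every expected header is present; anything else lists what the function still needs to add.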
Making changes to your lambda code
Whenever you need to make changes to your lambda code, we recommend:
- Publishing a new version through the “Actions” button dropdown
- Deleting the triggers on the older version (you can click on the “Qualifiers” dropdown to see all the versions of your lambda)
- Associating the triggers with the newest version number you recently published
After deploying your lambda for the first time, or after publishing a new version and associating the triggers with it, you may not see the security headers right away in your web application’s responses. This is because the edge servers in CloudFront cache the responses. Depending on the time-to-live set in your cache behaviors, you may have to wait a while for the cache to clear before the latest security headers appear, unless you run a cache invalidation on the affected CloudFront distribution.
Tip: To avoid refreshing the page a lot or sitting around unsure if your changes worked, kick off a CloudFront cache invalidation to speed up the process of clearing out the cache so you can see your updated security headers.
Go to your CloudFront services page, wait for your CloudFront distribution’s status to be deployed, meaning all the lambda associations are done and deployed, and go to the “Invalidations” tab. Click “Create Invalidation,” put “/*” as the object path to invalidate everything in the cache, and hit “Invalidate.” This should not take too long, and once it is marked as complete, you should see the latest security header changes after refreshing your web application.
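If you would rather script the invalidation than click through the console, the request body is small. The sketch below only builds the input for CloudFront’s CreateInvalidation API; the `buildInvalidationParams` helper and the `EXAMPLEDISTID` distribution ID are hypothetical. With the AWS SDK for JavaScript v3, an object like this would be passed to `CreateInvalidationCommand` from `@aws-sdk/client-cloudfront`, which this sketch assumes but does not call.

```javascript
// Build the input for a CloudFront CreateInvalidation request.
// distributionId is a placeholder for your own distribution's ID.
function buildInvalidationParams(distributionId, paths = ["/*"]) {
  return {
    DistributionId: distributionId,
    InvalidationBatch: {
      // CallerReference must be unique per request; a timestamp is a simple choice.
      CallerReference: `security-headers-${Date.now()}`,
      // Quantity has to match the number of paths listed in Items.
      Paths: { Quantity: paths.length, Items: paths },
    },
  };
}

console.log(JSON.stringify(buildInvalidationParams("EXAMPLEDISTID"), null, 2));
```

The default path of “/*” mirrors the console steps above and clears everything in the cache.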
As you iterate on your security headers based on what you find as violations or errors in your web application, you can repeat this process:
- Publishing a new lambda function version
- Deleting the triggers on the old lambda version
- Associating the triggers on the new version
- Cache invalidating your CloudFront distribution
- Testing your web application
- Repeating until you feel confident that things are working as expected without any blank pages, failed API requests, or console security errors.
Once things are stable, you can optionally move on to Terraforming what you just did manually into code configurations, assuming you have Terraform integrated with your AWS accounts. We will not cover how to set up Terraform from the beginning, but we will show you snippets of what the Terraform code will look like.
Terraforming the Lambda@Edge triggered by our CloudFront distribution
After iterating on the Lambda@Edge function for security headers in the “us-east-1” region, we wanted to add this to our Terraform codebase for code maintainability and version control down the road.
For all the cache behaviors that we implemented already, we had to associate the cache behavior with the Lambda@Edge function to be triggered by the origin response event.
The following steps assume you already have most of the CloudFront distributions and S3 buckets configured through Terraform. We will focus on the main modules and properties that relate to Lambda@Edge and add the trigger to the CloudFront distribution’s cache behaviors. We will not walk through how to set up your S3 buckets and other CloudFront distribution settings from scratch through Terraform, but we hope you can see the level of effort to accomplish this on your own.
We currently break up our AWS resources into separate module folders and pass variables into those modules for flexibility in our configuration. We have an apply folder with a development and a production sub-folder, and each has its own main.tf file where we call these modules with certain input variables to instantiate or modify our AWS resources.
Those sub-folders also have their own lambdas folder where we hold our lambda code such as a security_headers_lambda.js file. The security_headers_lambda.js file has the same code we used in our lambda function when we tested manually; we also save it in our codebase so that Terraform can zip and upload it.
1. First, we need a reusable module to zip up our lambda file before it gets uploaded and published as another version of our Lambda@Edge function. This takes in a path to our lambda folder holding the eventual Node.js lambda function.
2. Next, we add onto our existing CloudFront module, which wraps the S3 buckets, policies, and CloudFront distribution resources, by also creating a lambda resource built from the zipped up lambda file. The lambda zip module’s outputs are passed as variables into the CloudFront module to set up the lambda resource. We need to set the AWS provider region to “us-east-1” and supply a working role policy.
3. Within the CloudFront module, we then associate this Lambda@Edge function with the CloudFront distribution’s cache behaviors.
4. Finally, putting it all together in our apply/development or apply/production folder’s main.tf file, we call all these modules and pass the proper outputs as variables into our CloudFront module.
These configuration tweaks essentially take care of the manual steps we did in the previous section to update the lambda code and associate the newer version with the CloudFront’s cache behaviors and triggers for origin response events. Woo! No need to go through or remember the AWS Console steps as long as we apply these changes to our resources.
How do we roll this out safely in different environments?
When we first associated our Lambda@Edge function with our testing CloudFront distribution, we noticed quickly how our web application would no longer load correctly. This was mainly due to how our Content-Security-Policy header was implemented and how it did not cover all the resources we were loading in our application. The other security headers posed less of a risk in terms of blocking our application from loading. In the future, we will focus on rolling out the security headers with further iterations in mind to fine tune the Content-Security-Policy header.
As mentioned earlier, we discovered how we can take advantage of the Content-Security-Policy-Report-Only header instead to minimize the risk as we gather more of the resource domains to add in each of our policies.
In this report-only mode, the policies still run in the browser and output console error messages for any policy violations. However, the browser does not block those scripts and sources, so our web application can still run as usual. It is up to us to go through the entire web application to be sure we do not miss any important sources in our policies, or else it will negatively affect our customers and support team.
For each environment, you can roll out the security headers lambda as follows:
- Publish changes to your lambda, either manually or through a Terraform plan and apply, shipping the other security headers together with the Content-Security-Policy-Report-Only header first.
- Wait for your CloudFront distribution status to be fully deployed with the lambda associated with the cache behaviors.
- Do a cache invalidation on your CloudFront distribution if the previous security headers are still showing up or it takes too long for your current changes to show up in your browser.
- Visit and tour your web application’s pages with the developer tools open, scanning the console for any “Report Only…” console error messages to improve your Content-Security-Policy header.
- Make changes to your lambda code to take into account those reported violations.
- Repeat from the first step until you feel confident enough to change your header from Content-Security-Policy-Report-Only to Content-Security-Policy, meaning it will be enforced for the environment.
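The step of folding reported violations back into your policy can be sketched as a small helper. This is only an illustration: `addReportedSources`, the shape of the violation objects, and the example.com domains are hypothetical, and the violations are assumed to be transcribed by hand from the console messages.

```javascript
// Add each reported origin to the matching directive, skipping duplicates.
// Returns a new object and leaves the input policy untouched.
function addReportedSources(directives, violations) {
  const updated = {};
  for (const [name, sources] of Object.entries(directives)) {
    updated[name] = [...sources];
  }
  for (const { directive, origin } of violations) {
    // Create the directive if the current policy does not define it yet.
    const sources = updated[directive] || (updated[directive] = []);
    if (!sources.includes(origin)) {
      sources.push(origin);
    }
  }
  return updated;
}

// Hypothetical policy and violations, copied by hand from console output.
const currentPolicy = {
  "script-src": ["'self'"],
  "connect-src": ["'self'"],
};
const reported = [
  { directive: "script-src", origin: "https://cdn.example.com" },
  { directive: "img-src", origin: "https://images.example.com" },
  { directive: "script-src", origin: "https://cdn.example.com" },
];

console.log(addReportedSources(currentPolicy, reported));
```

Review every origin before you allow it. A reported violation can also be a script you did not intend to load, and adding it blindly would defeat the purpose of the policy.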
Improving our security headers score
After successfully applying the Terraform changes to our environments and invalidating the CloudFront caches, we refreshed the pages in our web application. We kept the developer tools open to confirm that security headers such as HSTS and CSP appeared in our network responses.
We also ran our web application through a security headers scan report such as the one on this site. We witnessed great improvements (an A rating!) from a previously failing grade, and you can achieve similar improvements after altering your S3/CloudFront setups to have security headers in place.
Moving forward with security headers
After manually setting up the security headers through the AWS Console or successfully Terraforming the solution and applying the changes to each of your environments, you now have a great foundation to iterate further and improve your existing security headers in the future.
Depending on your web application’s evolution, you may have to make your Content-Security-Policy header more specific in terms of the resources allowed for tighter security. Or you may need to add a new header entirely for a separate purpose or to fill in another security hole.
With these future changes to your security headers in your Lambda@Edge functions, you can follow similar release strategies per environment to be sure your web applications are secure from malicious attacks on the web and still working without your users ever noticing the difference.
For more articles written by Alfred Lucero, go to his blog author page: https://sendgrid.com/blog/author/alfred/