Optimizing CloudFront Cache with Conditional GET Requests

Discover the power of optimizing Amazon CloudFront cache with Conditional GET Requests. Learn how ETags and proper handling reduce data transfer, enhance performance, and cut infrastructure costs.

Fabian Loose - May 24, 2023

Caching in Amazon CloudFront is important because it helps to reduce the latency of your web application by serving content from the nearest edge location to the end-user.

When a user requests content from your application, CloudFront checks to see if the content is already cached in the edge location closest to the user. If it is, CloudFront serves the cached content directly to the user, reducing the round-trip time to your origin server and improving the user's experience. This can significantly reduce the load on your origin server and improve the scalability and reliability of your application.

Additionally, CloudFront allows you to set caching rules for different types of content and adjust the time-to-live (TTL) for each cache item, giving you greater control over the caching behavior of your application.

In this blog post, we'll explore how to combine CloudFront's caching capabilities with ETags and conditional HTTP GET requests to further reduce the amount of data transferred between your web application, CloudFront, and your custom origin. Less data on the wire means lower latency, better overall performance, and savings on bandwidth and infrastructure costs.

ETag and Conditional GET Request

The ETag (entity tag) is an HTTP response header that is used to optimize web performance and reduce unnecessary network traffic.

An ETag is an identifier that the web server generates and attaches to a specific version of a resource (such as a web page or an image). When the client (e.g. a web browser) requests the resource again, it can send the ETag value back in the “If-None-Match” request header. The server then compares that value to the ETag of the current version of the resource. If they match, the resource has not changed since the client last fetched it, so the server can return a '304 Not Modified' response instead of sending the entire resource again. A 304 response contains only headers and no payload, which saves network bandwidth and improves web performance. This technique is called a “conditional GET request”.
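The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function names (`make_etag`, `if_none_match_matches`, `respond`) are hypothetical, the ETag is assumed to be a truncated SHA-256 hash of the body, and the header parsing is deliberately simplified.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (assumption: hash-based tags).

    HTTP requires the tag value to be wrapped in double quotes.
    """
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    """Drop a leading W/ so weak and strong tags compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def if_none_match_matches(header_value: str, etag: str) -> bool:
    """Return True if the If-None-Match header lists the ETag (or is '*').

    Simplification: tags are split on commas, so this assumes the tags
    themselves contain no commas.
    """
    candidates = [c.strip() for c in header_value.split(",")]
    if "*" in candidates:
        return True
    return _strip_weak(etag) in {_strip_weak(c) for c in candidates}


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, payload) for a GET of the given body."""
    etag = make_etag(body)
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None and if_none_match_matches(if_none_match, etag):
        # The client already has this version: headers only, no payload.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


# First request: no validator, so the full page is returned.
page = b"<html>hello</html>"
status, headers, payload = respond({}, page)

# Second request: the client echoes the ETag and gets a header-only 304.
status_again, _, payload_again = respond({"If-None-Match": headers["ETag"]}, page)
```

Hashing the body is only one way to produce an ETag; a version number or a last-modified timestamp from a database can serve the same purpose without reading the full content.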

Conditional GET Requests between Client, CloudFront and the Custom Origin

CloudFront as well as modern web browsers support conditional GET requests by default. Managed AWS services such as S3 typically also take care of proper ETag handling. With a custom origin, for example an EC2 instance or an ECS cluster, your own application is responsible for the ETag handling, and this is where a proper implementation can increase overall performance while decreasing infrastructure costs. Let’s look at the following scenarios to illustrate a few typical cases.

Case 1: Fresh CloudFront Cache

When a client requests a web page that is cached in CloudFront and still valid, CloudFront returns a copy of the web page without contacting the origin. The response includes the complete web page as well as an ETag header.

If the same browser requests the same web page again some time later, it includes the ETag in the request. Since the cached page in CloudFront is still valid and the ETags match, CloudFront answers with a 304 response, telling the browser that the object has not changed. The browser can then display its locally cached version and mark it as valid again.

Case 2: Empty CloudFront Cache

When a client requests a web page that is not cached in CloudFront, CloudFront forwards the request to your custom origin. The origin's response is then passed on to the client and stored in the CloudFront cache for subsequent requests. From that moment on, we are in Case 1 again.

Case 3: Stale CloudFront Cache

This scenario is critical to the overall efficiency of web applications, and it highlights the importance of proper ETag and conditional GET request handling by the custom origin. When a user requests a web page that is cached in CloudFront but whose cached version has expired, CloudFront contacts the custom origin to check for the most recent version of the page. CloudFront includes the ETag of the cached version in this request to the origin, in the “If-None-Match” header.

If the ETag of the cached version matches the current version of the resource at the origin, the origin server returns a 304 response to CloudFront, indicating that the content has not been modified since the cached version was created. This response is very cheap in terms of network traffic, as it contains only headers and no payload. CloudFront can then mark the cached version as valid again and return it to the user. If the ETags do not match, the origin returns the full new version with a 200 response, and CloudFront replaces its cached copy. In both cases, we are back in Case 1.
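The three cases can be played through with a small toy model. This is only a sketch under stated assumptions: `Origin` and `EdgeCache` are hypothetical classes that model a single URL at a single edge location, time is passed in explicitly as `now`, and nothing here reflects how CloudFront is implemented internally. The model also leaves out the 304 responses from the edge to the browser.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CachedObject:
    body: bytes
    etag: str
    expires_at: float


class Origin:
    """Hypothetical custom origin that supports conditional GET."""

    def __init__(self, body: bytes):
        self.body = body
        self.payload_bytes_sent = 0  # payload bytes sent to the edge

    def get(self, if_none_match=None):
        etag = '"' + hashlib.sha256(self.body).hexdigest()[:16] + '"'
        if if_none_match == etag:
            return 304, etag, b""  # unchanged: headers only
        self.payload_bytes_sent += len(self.body)
        return 200, etag, self.body


class EdgeCache:
    """Toy model of one edge location caching one URL for `ttl` seconds."""

    def __init__(self, origin: Origin, ttl: float):
        self.origin = origin
        self.ttl = ttl
        self.entry = None

    def get(self, now: float):
        entry = self.entry
        if entry is not None and now < entry.expires_at:
            return "fresh", entry.body  # Case 1: served from cache
        if entry is None:
            _, etag, body = self.origin.get()  # Case 2: empty cache
            self.entry = CachedObject(body, etag, now + self.ttl)
            return "miss", body
        # Case 3: stale entry, revalidate with the cached ETag.
        status, etag, body = self.origin.get(if_none_match=entry.etag)
        if status == 304:
            entry.expires_at = now + self.ttl
            return "revalidated", entry.body
        self.entry = CachedObject(body, etag, now + self.ttl)
        return "refreshed", body


origin = Origin(b"x" * 1000)
edge = EdgeCache(origin, ttl=60)
for t in (0, 30, 90):
    outcome, _ = edge.get(now=t)
    print(t, outcome, origin.payload_bytes_sent)
```

In this run the origin sends the 1000-byte payload only once, for the initial miss. The revalidation at `t=90` costs a header-only 304, which is the saving that Case 3 describes.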

Performance Gain and Cost Reduction

Properly handling ETags and conditional GET requests by the custom origin can greatly improve the performance, efficiency and infrastructure costs of a web application. This is because it reduces the amount of data that needs to be transferred between the custom origin and CloudFront.

How much performance a specific web application gains, and how much cost is actually saved by supporting ETags and conditional GET requests in the custom origin, is hard to estimate because it depends heavily on the individual use case.

What is clear is that every revalidation answered with a 304 instead of a full response reduces data traffic, which has a positive effect on performance, efficiency, and ultimately infrastructure costs.
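For a rough feel of the effect, the saved payload can be estimated with simple arithmetic. The function below is a back-of-the-envelope sketch: the name `origin_payload_bytes_saved` and all input numbers in the example are hypothetical and not measurements, and it assumes that a 304 and a 200 response carry roughly the same headers, so only the body counts as saved.

```python
def origin_payload_bytes_saved(revalidations: int,
                               unchanged_ratio: float,
                               body_bytes: int) -> int:
    """Estimate payload bytes the origin does not have to send.

    revalidations:   requests from CloudFront for stale objects
    unchanged_ratio: share of those where the content is unchanged (0..1)
    body_bytes:      average response body size in bytes
    """
    if not 0.0 <= unchanged_ratio <= 1.0:
        raise ValueError("unchanged_ratio must be between 0 and 1")
    unchanged = round(revalidations * unchanged_ratio)
    # Each unchanged revalidation is answered with a 304 and no body.
    return unchanged * body_bytes


# Hypothetical inputs: 10,000 revalidations, 90% unchanged, 50 kB pages.
saved = origin_payload_bytes_saved(10_000, 0.9, 50_000)
print(saved)
```

Plugging in figures from your own access logs (number of origin requests, share of unchanged content, typical response size) gives a first indication of whether the effort is worthwhile for your workload.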

Conclusion

Caching in Amazon CloudFront plays a significant role in improving the performance and user experience of web applications by reducing latency and optimizing network traffic. By leveraging CloudFront's caching capabilities in conjunction with ETags and conditional HTTP GET requests, developers can further reduce the amount of data that needs to be transferred between their application, CloudFront, and their custom origin. Proper ETag and conditional GET request handling by custom origins can make a significant difference in reducing infrastructure costs while improving performance and efficiency. By understanding and implementing these techniques, developers can optimize their web applications for a better user experience and greater scalability.