First, a quick note:
Nothing about the core infrastructure described in this post is specific to Django! The code examples are from the CDN for the Weird Sheep Labs CMS, which is built with Django/Wagtail, but the CDN architecture can be used with any web framework.
Introduction
Amazon CloudFront is a content delivery network (CDN) that delivers static content, such as images, videos, and web pages, to users around the world. It acts as an intermediary between the content provider and the end-user, caching content from an origin server like S3, and serving it from edge locations closer to the user for faster delivery.
In addition to lower latency and increased scalability, serving static files this way lets you separate your application and content servers. That splits off the heavy, read-only workload of serving static content, which reduces load on the application server or, if you are deploying your Django app serverlessly, saves you precious Lambda compute time.
Building the infrastructure
The AWS Cloud Development Kit (CDK) is AWS's own infrastructure-as-code (IaC) offering, which lets developers compose their AWS infrastructure in a variety of programming languages, one of which is Python. In this section we'll go through some of the core components of the infrastructure, but for those interested, the entire repo is available here.
The architecture of the system looks something like this:
S3 bucket
As mentioned earlier, the S3 bucket represents the origin server of the CDN. We create a bucket using the s3.Bucket construct with restrictive access controls: all public access is blocked, as the contents of the bucket should only be accessed through the CloudFront distribution, and we lock down access control to only the bucket owner. The CORS rules are somewhat redundant while all public access is blocked, but they serve as a reference for what would be needed if some public access to the bucket were required.
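A minimal sketch of a bucket along these lines in CDK Python might look like the following. The stack name, the construct ID and the specific CORS values are assumptions made for illustration, not necessarily the values used in the repo.

```python
from aws_cdk import Stack, aws_s3 as s3
from constructs import Construct


class StaticBucketStack(Stack):
    """Hypothetical stack containing only the CDN origin bucket."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        self.bucket = s3.Bucket(
            self,
            "StaticFilesBucket",  # hypothetical construct ID
            # Block every form of public access: objects should only be
            # reachable through the CloudFront distribution.
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
            # Lock ACL-based access down to the bucket owner.
            access_control=s3.BucketAccessControl.BUCKET_OWNER_FULL_CONTROL,
            # Redundant while public access is blocked; kept as a reference
            # for what direct browser access would need (values assumed).
            cors=[
                s3.CorsRule(
                    allowed_methods=[s3.HttpMethods.GET, s3.HttpMethods.HEAD],
                    allowed_origins=["*"],
                    allowed_headers=["*"],
                )
            ],
        )
```

Exposing the bucket as an attribute of the stack makes it easy to hand to the constructs that follow.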
We then add a resource-based IAM policy to the bucket which grants select principals full access to the bucket and its objects, allowing them to upload files to the bucket when calling python manage.py collectstatic, for example. The BUCKET_USERS environment variable is just a comma-separated list of allowed principal ARNs, e.g. BUCKET_USERS=arn:aws:iam::111111111111:user/user_1,arn:aws:sts::111111111111:federated-user/user_2
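In CDK this kind of policy is typically attached with iam.PolicyStatement and bucket.add_to_resource_policy. To keep the example runnable without any AWS tooling, the sketch below uses only the standard library: it parses BUCKET_USERS and builds the JSON policy document that such a statement would roughly correspond to. The function names and the example bucket name are hypothetical, and "full access" is assumed to mean the s3:* action on both the bucket and its objects.

```python
import json
import os


def parse_bucket_users(raw: str) -> list[str]:
    """Split a comma-separated BUCKET_USERS value into principal ARNs."""
    return [arn.strip() for arn in raw.split(",") if arn.strip()]


def build_bucket_policy(bucket_name: str, principal_arns: list[str]) -> dict:
    """Return a policy document granting the principals full S3 access."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arns},
                # Assumption: "full access" is modelled as every S3 action.
                "Action": "s3:*",
                # Bucket-level actions target the bucket ARN; object-level
                # actions target the objects underneath it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


if __name__ == "__main__":
    users = parse_bucket_users(os.environ.get("BUCKET_USERS", ""))
    # "example-static-bucket" is a placeholder bucket name.
    print(json.dumps(build_bucket_policy("example-static-bucket", users), indent=2))
```

Listing both the bucket ARN and the bucket ARN followed by /* matters because collectstatic needs to list the bucket as well as write objects into it.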
CloudFront distribution
The CloudFront distribution is then configured with the S3 bucket as the origin, and optionally with a custom domain and related SSL certificate. Note that the certificate has to be requested beforehand through AWS Certificate Manager.
cf_origins.S3Origin is a high-level construct that configures all the permissions and settings CloudFront needs to serve content from the given S3 bucket.
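As a rough sketch, a distribution with an S3 origin and an optional custom domain could be put together as shown here. The function name, its parameters, the construct IDs and the HTTPS redirect policy are assumptions for illustration; only cf_origins.S3Origin is taken from the text above.

```python
from typing import Optional

from aws_cdk import (
    aws_certificatemanager as acm,
    aws_cloudfront as cf,
    aws_cloudfront_origins as cf_origins,
    aws_s3 as s3,
)
from constructs import Construct


def create_distribution(
    scope: Construct,
    bucket: s3.IBucket,
    domain_name: Optional[str] = None,
    certificate_arn: Optional[str] = None,
) -> cf.Distribution:
    """Hypothetical helper: CloudFront distribution in front of the bucket."""
    certificate = None
    if domain_name and certificate_arn:
        # The certificate must already exist in ACM. For CloudFront it has
        # to be issued in the us-east-1 region.
        certificate = acm.Certificate.from_certificate_arn(
            scope, "Certificate", certificate_arn
        )

    return cf.Distribution(
        scope,
        "Distribution",  # hypothetical construct ID
        default_behavior=cf.BehaviorOptions(
            # S3Origin sets up the access CloudFront needs to read the bucket.
            origin=cf_origins.S3Origin(bucket),
            # Assumed policy: redirect plain HTTP requests to HTTPS.
            viewer_protocol_policy=cf.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
        ),
        # Only attach the custom domain when a certificate is available.
        domain_names=[domain_name] if certificate else None,
        certificate=certificate,
    )
```

Without a domain name and certificate ARN, the distribution is simply served from its default cloudfront.net domain.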
Finally, the relevant DNS records for the custom domain are created in the Route53 hosted zone.
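One way to create those records is with alias records pointing at the distribution, sketched below. The function name, parameters and construct IDs are hypothetical, and the choice of an A record plus an AAAA record is an assumption rather than a description of the repo.

```python
from aws_cdk import (
    aws_cloudfront as cf,
    aws_route53 as route53,
    aws_route53_targets as targets,
)
from constructs import Construct


def create_dns_records(
    scope: Construct,
    distribution: cf.IDistribution,
    zone_name: str,
    record_name: str,
) -> None:
    """Hypothetical helper: alias records for the CDN's custom domain."""
    # from_lookup resolves the hosted zone at synth time, so the enclosing
    # stack needs an explicit account and region in its env.
    zone = route53.HostedZone.from_lookup(
        scope, "HostedZone", domain_name=zone_name
    )

    # Alias target that resolves to the CloudFront distribution.
    target = route53.RecordTarget.from_alias(
        targets.CloudFrontTarget(distribution)
    )

    # IPv4 record for the custom domain.
    route53.ARecord(
        scope, "AliasARecord", zone=zone, record_name=record_name, target=target
    )
    # IPv6 record for the custom domain (assumed to be wanted as well).
    route53.AaaaRecord(
        scope, "AliasAaaaRecord", zone=zone, record_name=record_name, target=target
    )
```

Here record_name would be the custom domain configured on the distribution, and zone_name the hosted zone that contains it.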
Conclusion
Hopefully we've shown that creating a scalable, low-latency and low-cost way of serving the static files for your Django (or other) application is possible with a relatively small CDK infrastructure stack. To see the code in its entirety, please head over to GitHub.