
CDN for Django static content using CDK, S3 and CloudFront

2024-05-14 | 6 min read
Armand Rego
When deploying a Django app into production, serving static files is something that requires some thought - we can't just leave it to the development server anymore! Using barely more than 100 lines of Python CDK code, we can create a low-cost, low-latency distribution network for our static files with Amazon S3 and CloudFront.

First, a quick note:

Nothing about the core infrastructure described in this post is specific to Django! The code examples are from the CDN for the Weird Sheep Labs CMS which is built with Django/Wagtail, but the CDN architecture can be used with any web framework.

Introduction

Amazon CloudFront is a content delivery network (CDN) that delivers static content, such as images, videos, and web pages, to users around the world. It acts as an intermediary between the content provider and the end-user, caching content from an origin server like S3, and serving it from edge locations closer to the user for faster delivery.
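
The caching behaviour described above can be illustrated with a toy model. This is only a sketch of the idea of an edge location sitting in front of an origin, not how CloudFront is actually implemented; the `EdgeCache` class, its `ttl_seconds` parameter and the in-memory "origin" are all hypothetical names made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy stand-in for a single edge location (illustrative only)."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a path to (expiry time, cached body).
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.origin_hits = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and cached[0] > now:
            return cached[1]  # still fresh: served without touching the origin
        # Cache miss or expired entry: go back to the origin and re-cache.
        self.origin_hits += 1
        body = self._fetch(path)
        self._store[path] = (now + self._ttl, body)
        return body


# A dict plays the role of the S3 origin in this sketch.
origin = {"/static/app.css": b"body { margin: 0; }"}
edge = EdgeCache(origin.__getitem__)
edge.get("/static/app.css")  # first request goes to the origin
edge.get("/static/app.css")  # second request is served from the cache
```

After the two requests above, `edge.origin_hits` is 1: only the first request reached the origin.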

In addition to lower latency and increased scalability, serving static files this way separates your application and content servers. Splitting off the heavy, read-only workload of serving static content reduces load on the application server or, if you are deploying your Django app serverlessly, saves you precious Lambda compute time.

Building the infrastructure

The AWS Cloud Development Kit (CDK) is AWS's own infrastructure-as-code (IaC) offering. It lets developers compose their AWS infrastructure in a variety of programming languages, one of which is Python. In this section we'll go through some of the core components of the infrastructure, but for those interested, the entire repo is available here.

The architecture of the system looks something like this:

CDN architecture. Nothing too complicated.

S3 bucket

As mentioned earlier, the S3 bucket is the origin server of the CDN. We create the bucket using the s3.Bucket construct with restrictive access controls: all public access is blocked, since the contents of the bucket should only be reached through the CloudFront distribution, and the access control list is set to private so that only the bucket owner has access. The CORS rules are somewhat redundant while all public access is blocked, but they serve as a reference for what would be needed if some public access to the bucket were required.

cors_rule = s3.CorsRule(
    allowed_methods=[s3.HttpMethods.GET, s3.HttpMethods.HEAD],
    allowed_headers=["*"],
    allowed_origins=["*"],
    max_age=300,
)

bucket = s3.Bucket(
    self,
    "WeirdSheepLabsCdnBucket",
    bucket_name=os.environ["BUCKET_NAME"],
    block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
    access_control=s3.BucketAccessControl.PRIVATE,
    cors=[cors_rule],
)

We then add a resource-based IAM policy to the bucket which grants select principals full access to the bucket and its objects, allowing them, for example, to upload files to the bucket when calling python manage.py collectstatic. The statement nominally applies to any principal, but a condition on aws:PrincipalArn narrows it to the ARNs we list. The BUCKET_USERS environment variable is just a comma-separated list of allowed principal ARNs, e.g. BUCKET_USERS=arn:aws:iam::111111111111:user/user_1,arn:aws:sts::111111111111:federated-user/user_2

bucket.add_to_resource_policy(
    iam.PolicyStatement(
        actions=["s3:*"],
        resources=[bucket.bucket_arn, bucket.arn_for_objects("*")],
        principals=[iam.AnyPrincipal()],
        conditions={
            "StringLike": {
                "aws:PrincipalArn": os.environ["BUCKET_USERS"].split(",")
            }
        },
    )
)
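
To make the effect of that statement more concrete, here is a rough, standalone sketch of the JSON it corresponds to, built with plain Python rather than CDK. The helper names `parse_bucket_users` and `bucket_policy_statement` and the bucket name `example-cdn-bucket` are hypothetical, and the exact document CDK synthesizes may differ in detail. The parser also strips whitespace and empty entries, which a bare `split(",")` does not.

```python
import json
from typing import Any, Dict, List


def parse_bucket_users(raw: str) -> List[str]:
    """Split a comma-separated BUCKET_USERS value into a list of ARNs.

    Strips stray whitespace and drops empty entries (e.g. a trailing comma).
    """
    return [arn.strip() for arn in raw.split(",") if arn.strip()]


def bucket_policy_statement(bucket_name: str, principal_arns: List[str]) -> Dict[str, Any]:
    """Approximate JSON shape of the statement added to the bucket policy."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Effect": "Allow",
        "Action": "s3:*",
        # Any principal matches here; the condition below does the filtering.
        "Principal": {"AWS": "*"},
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
        "Condition": {"StringLike": {"aws:PrincipalArn": principal_arns}},
    }


if __name__ == "__main__":
    users = parse_bucket_users(
        "arn:aws:iam::111111111111:user/user_1, "
        "arn:aws:sts::111111111111:federated-user/user_2,"
    )
    print(json.dumps(bucket_policy_statement("example-cdn-bucket", users), indent=2))
```

Since "s3:*" on both the bucket and its objects is broad, it is worth keeping the list of ARNs short and reviewing it whenever the set of deployers changes.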

CloudFront distribution

The CloudFront distribution is then configured with the S3 bucket as the origin, and optionally with a custom domain and its SSL certificate. Note that the certificate has to be requested beforehand through AWS Certificate Manager, and that CloudFront only accepts certificates issued in the us-east-1 region.

cf_origins.S3Origin is a high-level construct that configures all the permissions and settings CloudFront needs to serve content from the given S3 bucket.

# Get SSL certificate
certificate = acm.Certificate.from_certificate_arn(
    self,
    "WeirdSheepLabsCdnCertificate",
    certificate_arn=os.environ["CERTIFICATE_ARN"],
)

# Create CloudFront distribution
distribution = cf.Distribution(
    self,
    "WeirdSheepLabsCdnDistribution",
    default_behavior=cf.BehaviorOptions(origin=cf_origins.S3Origin(bucket)),
    domain_names=[self.fqdn],
    certificate=certificate,
)
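
The snippet above uses self.fqdn without showing where it comes from. One plausible way to derive it, assuming it is simply the SUBDOMAIN and HOSTED_ZONE_NAME environment variables (used for the DNS records later in this post) joined together, is sketched below. The `build_fqdn` helper is hypothetical and the actual repo may compute the value differently.

```python
def build_fqdn(subdomain: str, zone_name: str) -> str:
    """Join a subdomain and a hosted zone name into a fully qualified domain name."""
    # Zone names are sometimes written with a trailing dot ("example.com.");
    # strip it, along with any stray dots around the subdomain.
    return f"{subdomain.strip('.')}.{zone_name.strip('.')}"


# Hypothetical values standing in for os.environ["SUBDOMAIN"] and
# os.environ["HOSTED_ZONE_NAME"].
fqdn = build_fqdn("cdn", "example.com.")
```

Whatever value ends up in domain_names must be covered by the certificate passed to the distribution, otherwise the deployment will be rejected.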

Finally, the relevant DNS records for the custom domain, an A record for IPv4 and an AAAA record for IPv6, are created in the Route 53 hosted zone as aliases pointing at the distribution.

# Get existing hosted zone and create records for CDN subdomain
zone = route53.HostedZone.from_hosted_zone_attributes(
    self,
    "WeirdSheepLabsHostedZone",
    hosted_zone_id=os.environ["HOSTED_ZONE_ID"],
    zone_name=os.environ["HOSTED_ZONE_NAME"],
)

route53.ARecord(
    self,
    "WeirdSheepLabsCdnARecord",
    zone=zone,
    target=route53.RecordTarget.from_alias(
        route53_targets.CloudFrontTarget(distribution)
    ),
    record_name=os.environ["SUBDOMAIN"],
)

# Construct IDs must be unique within a scope, so the AAAA record
# needs a different ID from the A record above.
route53.AaaaRecord(
    self,
    "WeirdSheepLabsCdnAaaaRecord",
    zone=zone,
    target=route53.RecordTarget.from_alias(
        route53_targets.CloudFrontTarget(distribution)
    ),
    record_name=os.environ["SUBDOMAIN"],
)
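
With the infrastructure deployed, the Django side still needs to be told to upload static files to the bucket and to build URLs against the CDN domain. This post does not cover that configuration, so the following is only a sketch under stated assumptions: Django 4.2 or later (for the STORAGES setting), the third-party django-storages package with its S3 backend, and a hypothetical `cdn_static_settings` helper, a "static" key prefix and the placeholder domain cdn.example.com.

```python
from typing import Any, Dict


def cdn_static_settings(bucket_name: str, cdn_domain: str) -> Dict[str, Any]:
    """Return Django settings that route static files through the CDN.

    Assumes Django 4.2+ and django-storages; not taken from the original repo.
    """
    return {
        "STORAGES": {
            "default": {"BACKEND": "django.core.files.storage.FileSystemStorage"},
            "staticfiles": {
                "BACKEND": "storages.backends.s3.S3Storage",
                "OPTIONS": {
                    "bucket_name": bucket_name,    # where collectstatic uploads
                    "custom_domain": cdn_domain,   # URLs point at CloudFront
                    "location": "static",          # key prefix inside the bucket
                    "querystring_auth": False,     # plain URLs, no signed query strings
                },
            },
        },
        "STATIC_URL": f"https://{cdn_domain}/static/",
    }


# Placeholder values; in settings.py these would typically come from the environment.
settings_fragment = cdn_static_settings("example-cdn-bucket", "cdn.example.com")
```

In a real settings.py the two keys of the returned dictionary would be assigned to STORAGES and STATIC_URL, after which python manage.py collectstatic uploads to the bucket using whichever AWS credentials are active, which need to belong to one of the principals listed in BUCKET_USERS.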

Conclusion

Hopefully we've shown that creating a scalable, low-latency and low-cost way of serving the static files for your Django (or otherwise) application is possible with a relatively small CDK infrastructure stack. To see the code in its entirety, please head over to GitHub.
