Introduction
In our last blog post we presented a demo serverless FastAPI application deployed on AWS Lambda and configured using CDK. Part of what made that architecture very simple and very scalable was the use of DynamoDB as the persistence layer. But what if the data you are storing is relational? DynamoDB can model simple relationships between data models, but sometimes an SQL database is what you really need to capture the complexity of highly related data.
Of course, one of the core features of any SQL database is a defined schema and database migrations to modify said schema, which raises the important question of how to best manage the migrations in a deployed environment. In this blog, we demonstrate a method of running database migrations for a serverless application using the AWS Custom Resource CDK construct.
Getting started
Key technologies
Two of the important technologies used in the application are SQLAlchemy, a popular Python ORM, commonly used with web application frameworks such as FastAPI and Flask, and Alembic, its sister database migration tool (written by the same author). The code examples included assume use of these libraries.
Setup
The starting point is a FastAPI application deployed as a Lambda function using CDK. The function is placed within the VPC and appropriate security groups to enable access to the RDS instance.
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
In the previous blog post, we used the mangum library to wrap the FastAPI application and convert the incoming HTTP requests into Lambda events. This time, we use the DockerImageFunction
construct with the following Dockerfile to allow us to bring in the Lambda Web Adapter, as well as package up the application and install dependencies.
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 | |
12 | |
13 | |
14 | |
15 | |
16 | |
17 | |
Handling the migrations
Up to here we have only set up the FastAPI application to run in Lambda. We can now set up the migrations to run on every cdk deploy
command - this involves configuring two CDK constructs in the stack:
- A migration Lambda function
- An AWS Custom Resource
Why a migration function
If we deployed the application in a traditional, server-full way, we would simply be able to run the command to execute the migrations on the server itself i.e alembic upgrade head
. Given we are deploying on Lambda, we need a way of circumventing the Docker CMD
directive that starts the uvicorn server and instead invoke the Alembic CLI command.
We do this by configuring another DockerImageFunction
that points to the same source code but with a new Dockerfile; Dockerfile.migrate
.
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 | |
The new Dockerfile.migrate
is given below, and simply points to the entrypoint for the Alembic console runner.
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
AWS Custom Resource
Now that we have the migration function configured, we need a way of triggering it when the stack is deployed. There are multiple ways in which this can be done such as via CDK pipelines, directly from a CI pipeline e.g Github Actions, or using an AWS Custom Resource.
Using an AWS Custom Resource brings certain advantages that make it good mechanism of invoking the migration function:
- Automatic dependency chain - the custom resource depends on the migration Lambda function so will only trigger after it is deployed
- Automatic rollback of the entire stack if the migrations fail to run
- Can achieve automated database migrations even without a CI pipeline - useful for ephemeral or test environments
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 | |
12 | |
13 | |
14 | |
15 | |
16 | |
17 | |
18 | |
19 | |
20 | |
21 | |
22 | |
23 | |
24 | |
25 | |
26 | |
27 | |
28 | |
29 | |
There are few key points to note about the Custom Resource. Firstly, the principle of least privilege is applied by only giving it the permissions required to invoke the migration function - that is all it needs to do, after all!
Secondly, we specify the payload of the Lambda invocation as the arguments to pass to the alembic.config.main
entrypoint that we configured in the Dockerfile.migrate
file. Note that the path to the Alembic config file is manually specified with "--config=/var/task/alembic.ini"
.
Lastly, the physical_resource_id
is set to the current time. This acts as a key to determine whether the SDK call is actually made - by making it unique it guarantees the call will be made on each deploy.
Conclusion
In this post we've shown how an AWS Custom Resource can be used to streamline database migrations in a serverless FastAPI application that uses SQLAlchemy and Alembic as the ORM and migrations tool.
If you'd like to know more, or have any other questions about AWS infrastructure, serverless or Python, please do book in a meeting with me!