This article covers using S3 Lifecycle Rules to archive bucket objects in the Glacier Flexible Retrieval storage class. Then we'll go over retrieving the archived documents with the AWS SDK v3 and notifying the user when a document is ready for download by sending an email with a download link.
As always, you can find the finished code on GitHub, and I have also posted this in video format, which covers some additional ground related to the UI!
Adding lifecycle rules
First off, I need to add a lifecycle rule so that documents get archived after one day, which is the minimum amount of time that AWS allows.
import { Duration } from 'aws-cdk-lib';
import { BlockPublicAccess, HttpMethods, StorageClass } from 'aws-cdk-lib/aws-s3';
import { Bucket } from 'sst/constructs';

const bucket = new Bucket(stack, 'uploads', {
  name: 'exanubes-upload-bucket-sst',
  cors: [
    {
      allowedMethods: [HttpMethods.POST],
      allowedOrigins: ['http://localhost:5173'],
      allowedHeaders: ['*']
    }
  ],
  cdk: {
    bucket: {
      blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
      versioned: true,
      lifecycleRules: [
        {
          enabled: true,
          // archive old versions quickly, keep the current version in S3 longer
          noncurrentVersionTransitions: [
            {
              transitionAfter: Duration.days(1),
              storageClass: StorageClass.GLACIER
            }
          ],
          transitions: [
            {
              transitionAfter: Duration.days(30),
              storageClass: StorageClass.GLACIER
            }
          ]
        }
      ]
    }
  }
});
So, to the bucket configuration from previous articles, I added a lifecycleRules prop, which takes an array of rules. In this case I'm transitioning non-current versions to Glacier after 1 day, and current versions after 30 days; the latter is just for completeness, as I will not use it in this example.
Recognizing archived documents
In the previous article, I added a zod validator for transforming the S3 API's response. I'm going to add a new StorageClass prop, which will tell me each version's storage class, and then transform it into a boolean value for ease of use.
export const versionResponseValidator = z
  .object({
    //...properties
    StorageClass: z.string().optional()
  })
  .transform((arg, ctx) => ({
    //...properties
    isArchived: arg.StorageClass === 'GLACIER'
  }));
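For context, here's a minimal sketch of how the validator might be applied to the ListObjectVersions response. The listVersions helper is my assumption, not code from the project; versionResponseValidator is the validator defined above.

import { S3Client, ListObjectVersionsCommand } from '@aws-sdk/client-s3';

const client = new S3Client({});

// Hypothetical helper: lists all versions of a document and runs each one
// through the validator so that every entry carries the isArchived flag
export async function listVersions(bucket: string, key: string) {
  const command = new ListObjectVersionsCommand({ Bucket: bucket, Prefix: key });
  const { Versions = [] } = await client.send(command);
  return Versions.map((version) => versionResponseValidator.parse(version));
}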
Restoring archived documents
Restoring a document is an asynchronous action that's decoupled from the HTTP request that triggers it. The way it works is: first we tell AWS that we want to restore a document, then AWS starts a job that retrieves the document from the archive, and only once that job completes is the document made available to us.
import { S3Client, RestoreObjectCommand } from '@aws-sdk/client-s3';

const client = new S3Client({});

export async function restoreObject(props) {
  /**@type {import('@aws-sdk/client-s3').RestoreObjectCommandInput}*/
  const input = {
    Bucket: props.bucket,
    Key: props.key,
    VersionId: props.versionId,
    RestoreRequest: {
      Days: 1,
      GlacierJobParameters: {
        Tier: 'Expedited'
      }
    }
  };
  const command = new RestoreObjectCommand(input);
  return client.send(command);
}
To initiate the job, we send a RestoreObjectCommand to AWS. Here we need to provide the bucket, key and versionId of the document that the user wants to access. Then we define the RestoreRequest, which takes the number of days the restored copy should be available for, and the GlacierJobParameters, which define the retrieval tier.
Expedited is the fastest tier (up to 5 minutes) but also the most expensive one per GB retrieved. There are also Standard and Bulk tiers available, with restore times of up to 5 and 12 hours respectively for the Glacier Flexible Retrieval storage class.
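Since the restore happens in the background, there's no callback on the request itself; one way to check progress is to send a HeadObjectCommand and inspect its Restore field. Here's a minimal sketch of that approach; the isRestoreComplete helper is hypothetical, not part of the project code.

import { S3Client, HeadObjectCommand } from '@aws-sdk/client-s3';

const client = new S3Client({});

// Hypothetical helper: the Restore field reads ongoing-request="true" while
// the job is running, and ongoing-request="false", expiry-date="..." once
// the restored copy is available
export async function isRestoreComplete(bucket: string, key: string, versionId?: string) {
  const command = new HeadObjectCommand({
    Bucket: bucket,
    Key: key,
    VersionId: versionId
  });
  const { Restore } = await client.send(command);
  return Restore?.includes('ongoing-request="false"') ?? false;
}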
Sending a notification
Now that we have the job started, we need to notify the user when the document is ready for download.
For this I'm going to use SNS, a pub/sub service that can be used to implement a fan-out pattern, i.e. spreading a single event among multiple subscribers. In this case I'm only using it to trigger a single lambda, so admittedly it's overkill. Good practice, though.
import type { SNSEvent, S3Event } from 'aws-lambda';
// render, GlacierObjectRestored, generatePresignedUrl and sendEmail are
// project helpers covered elsewhere in this article and the previous one

export async function handler(event: SNSEvent) {
  const [{ Sns }] = event.Records;
  const s3Event: S3Event = JSON.parse(Sns.Message);
  const [{ s3, glacierEventData }] = s3Event.Records;
  const { key, versionId } = s3.object;
  const validUntil = glacierEventData?.restoreEventData.lifecycleRestorationExpiryTime;
  const signedUrl = await generatePresignedUrl({
    bucketName: s3.bucket.name,
    key,
    versionId
  });
  const html = render(<GlacierObjectRestored signedUrl={signedUrl} expiry={validUntil} />);
  return sendEmail(html);
}
First off, I'm parsing the SNS message, which actually contains a stringified version of the S3Event object, the same one that would be sent to a lambda triggered directly by an s3 event.
Then I extract the key and versionId from the s3 property, and the validUntil timestamp from glacierEventData.
Next, I generate a presigned url for the document so that it can be downloaded from the email, and render an html template with the url and the expiry time.
Finally, I send the email using the sendEmail function, which I'll cover shortly.
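generatePresignedUrl itself isn't shown in this article; a minimal sketch using GetObjectCommand with the @aws-sdk/s3-request-presigner package could look like this. The one-hour expiresIn is an assumption on my part.

import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

const client = new S3Client({});

// Hypothetical implementation: signs a GET request for a specific version
export async function generatePresignedUrl(props: {
  bucketName: string;
  key: string;
  versionId: string;
}) {
  const command = new GetObjectCommand({
    Bucket: props.bucketName,
    Key: props.key,
    VersionId: props.versionId
  });
  // expiresIn is in seconds; note the link can expire before the restored copy does
  return getSignedUrl(client, command, { expiresIn: 3600 });
}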
Sending an email with SES
For sending an email we need to provide source and destination email addresses, a subject line and a message body.
import { SESClient, SendEmailCommand, SendEmailCommandInput } from '@aws-sdk/client-ses';

const emailClient = new SESClient({});

async function sendEmail(html: string) {
  const input: SendEmailCommandInput = {
    Source: 'noreply@example.com',
    Destination: {
      ToAddresses: ['john.doe@example.com']
    },
    Message: {
      Body: {
        Html: {
          Charset: 'UTF-8',
          Data: html
        }
      },
      Subject: {
        Charset: 'UTF-8',
        Data: 'Your document is ready!'
      }
    }
  };
  const command = new SendEmailCommand(input);
  return emailClient.send(command);
}
Listening for S3 Object Events
With the lambda in place, I can now put it together with the SNS topic and the S3 bucket.
import { Topic } from 'sst/constructs';

const topic = new Topic(stack, 'objectRestored', {
  subscribers: {
    notificationEmail: {
      type: 'function',
      function: {
        functionName: 'object-restored-notification-email',
        handler: 'packages/functions/src/object-restored-notification-email.handler',
        architecture: 'arm_64',
        permissions: ['ses', 's3']
      }
    }
  }
});
First I need to create a topic and subscribe the lambda to it, not forgetting the IAM permissions for the SES and S3 services that the lambda is using.
Once that's done, I can add a notification configuration to the bucket that triggers the topic on the object_restore_completed event.
const bucket = new Bucket(stack, 'uploads', {
  name: 'exanubes-upload-bucket-sst',
  cors: [
    //...
  ],
  cdk: {
    //...
  },
  notifications: {
    GlacierObjectRestored: {
      events: ['object_restore_completed'],
      type: 'topic',
      topic
    }
  }
});
Notice that I'm passing the previously created topic construct, not another function notification, to the Bucket config.
Summary
So to sum up, we've added a lifecycle rule that archives non-current document versions in the Glacier Flexible Retrieval storage class after 1 day. Then, using the AWS SDK, we initiated a restore job for the archived document and configured a fan-out pattern with SNS and lambda to notify the user once the document was ready.
Ideally, we would also track the documents being restored so that we can display them correctly in the UI. We could use DynamoDB and utilise the TTL attribute, or a regular relational database, which is what I covered in the video version of this article.
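For illustration only, here's a rough sketch of what tracking a restore in DynamoDB with a TTL attribute might look like. The table name, key schema and trackRestore helper are all assumptions, not code from the project.

import { DynamoDBClient, PutItemCommand } from '@aws-sdk/client-dynamodb';

const db = new DynamoDBClient({});

// Hypothetical helper: TTL must be enabled on the expiresAt attribute, which
// holds a unix timestamp in seconds; DynamoDB then deletes the item around
// the time the restored copy expires in S3
export async function trackRestore(key: string, versionId: string, validUntil: Date) {
  const command = new PutItemCommand({
    TableName: 'restored-documents',
    Item: {
      pk: { S: `${key}#${versionId}` },
      expiresAt: { N: Math.floor(validUntil.getTime() / 1000).toString() }
    }
  });
  return db.send(command);
}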