
How a Lambda-backed Custom Resource saved the day!

Blog post created by mwhittington on Oct 13, 2016

CloudFormation is wonderful. It's a great way of designing your infrastructure as code (JSON or YAML). But it does have its limitations, and there is feature disparity between the template DSL and the AWS CLI. Considering the work that's gone into implementing this system, I'm not surprised, but it's still super frustrating.

 

What is CloudFormation? It's an AWS service that allows you to design an infrastructure containing one, some, or all of the AWS resources available (EC2, S3, VPC, etc.) with code. This "template" then becomes like any other piece of source code: versionable and testable, but more importantly, it gives the designer the ability to repeatably and reliably deploy a stack into AWS. With one template I can deploy ten identical yet independent stacks.

 

The issue I was challenged to solve was this: every time a stack we created with CloudFormation was deleted via the console (the AWS UI), the deletion would fail. This was because an S3 bucket we created still contained objects. Through the UI there was no way of purging the bucket before deletion (short of someone manually emptying the bucket via the S3 console), BUT when deleting a bucket in the same state (one that still contains objects) using the AWS CLI, it's possible to pass a --force flag to the remove-bucket command (aws s3 rb) to delete the objects first. Obviously we wanted the stacks to behave the same regardless of the approach we took to manage them, and some people using the CloudFormation service may not be technical enough to use CLI commands.

 

So my solution: create a Lambda-backed custom resource that would manage the emptying and deletion of an S3 bucket.

 

A custom resource is similar to the other AWS resources that CloudFormation templates manage; it has much the same syntax. For a more detailed understanding, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-custom-resources.html . The essential difference with a custom resource is the necessity to manage and implement the communication between the custom resource, the service it's backed by (currently Lambda and/or SNS) and the CloudFormation template.

 

The first step was to add the resource to the template. Here's what it looks like (both JSON and YAML):

 

"EmptyBuckets": {
  "Type": "Custom::LambdaDependency",
  "Properties": {
  "ServiceToken": {
  "Fn::Join": [
  "",
  [
  "arn:aws:lambda:",
  {
  "Ref": "AWS::Region"
  },
  ":",
  {
  "Ref": "AWS::AccountId"
  },
  ":function:emptyBucketLambda"
  ]
  ]
  },
  "BucketName": {
  "Ref": "S3BucketName"
  }
  }
  }

 

And the YAML version:

 

EmptyBuckets:
    Type: Custom::LambdaDependency
    Properties:
      ServiceToken:
        Fn::Join:
        - ''
        - - 'arn:aws:lambda:'
          - Ref: AWS::Region
          - ":"
          - Ref: AWS::AccountId
          - ":function:emptyBucketLambda"
      BucketName:
        Ref: S3BucketName

 

The most important part of the resource is the "Properties" section. A "ServiceToken" element must be declared, and it must be the ARN (Amazon Resource Name) of either an SNS topic or a Lambda function. In the examples above we build the ARN from references to the region and account the stack is deployed to, as that is where the Lambda function has also been uploaded. After the "ServiceToken" element is declared, any values can be passed in as primitive types, objects or arrays. These are the arguments that the Lambda function will pick up for the code to handle as desired. As seen in the example, we simply pass the "BucketName", as that's all the Lambda needs to perform its tasks. The custom resource fires upon a state change of the stack (Create, Update, Delete), and within the Lambda we can decide which states we are interested in and handle them accordingly.
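To make the Fn::Join in the templates above concrete, here is a plain-JavaScript equivalent of what CloudFormation evaluates: the pieces are concatenated with an empty delimiter to produce the function ARN. The region and account ID below are placeholder values, not real ones.

```javascript
// Plain-JS sketch of the Fn::Join above: join the parts with an
// empty delimiter to build the Lambda function's ARN.
function buildServiceToken(region, accountId) {
    return [
        'arn:aws:lambda:',
        region,             // Ref: AWS::Region
        ':',
        accountId,          // Ref: AWS::AccountId
        ':function:emptyBucketLambda'
    ].join('');
}

console.log(buildServiceToken('eu-west-1', '123456789012'));
// arn:aws:lambda:eu-west-1:123456789012:function:emptyBucketLambda
```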

 

Moving onto the Lambda code now, we can see and point to the "BucketName" parameter being passed in:

 

'use strict';


var AWS = require('aws-sdk');
var s3 = new AWS.S3();


exports.handler = (event, context) => {
    if (!event.ResourceProperties.BucketName) {
        return sendResponse(event, context, "FAILED", null, "BucketName not specified");
    }
    var bucketName = event.ResourceProperties.BucketName;
    var physicalResourceId = `${bucketName}-${event.LogicalResourceId}`;
    if (event.RequestType === 'Delete') {
        console.log(JSON.stringify(event, null, '  '));
        // Is the bucket versioned?
        s3.getBucketVersioning({ 'Bucket': bucketName }, (err, data) => {
            if (err) return sendResponse(event, context, "FAILED", null, err);
            console.log('Versioning status: ', JSON.stringify(data));
            switch (data.Status) {
                case "Enabled":
                // Initial params without markers
                return emptyVersionedBucket({
                    'Bucket': bucketName
                }, event, context, physicalResourceId);
                default:
                // Initial params without continuation
                return emptyBucket({
                    'Bucket': bucketName
                }, event, context, physicalResourceId);
            }
        });
    } else return sendResponse(event, context, "SUCCESS", physicalResourceId);
};

 

The ResourceProperties value in the event object contains the arbitrary parameters the Lambda needs to function. If, for some reason, the BucketName param isn't present, we send a response back to the pre-signed S3 URL included with the event to notify the CloudFormation process that this invocation failed. We also build our own physicalResourceId, because if one is not specified the sendResponse function falls back to the logStreamName, which can point to more than one resource and cause issues. As we don't have any logic to run when the request type is 'Create' or 'Update', we simply send a response back stating that all was successful. If the request type is 'Delete', we log the event details, check whether the bucket is versioned, and call the appropriate emptying function, starting with emptyBucket as shown below:

 

function emptyBucket(objParams, event, context, physicalResourceId) {
    console.log("emptyBucket(): ", JSON.stringify(objParams));
    s3.listObjectsV2(objParams, (err, result) => {
        if (err) return sendResponse(event, context, "FAILED", physicalResourceId, err);


        if (result.Contents.length > 0) {
            var objectList = result.Contents.map(c => ({ 'Key': c.Key }));
            console.log(`Deleting ${objectList.length} items...`);
            var obj = {
                'Bucket': objParams.Bucket,
                'Delete': {
                    'Objects': objectList
                }
            };


            s3.deleteObjects(obj, (e, data) => {
                if (e) return sendResponse(event, context, "FAILED", physicalResourceId, e);
                console.log(`Deleted ${data.Deleted.length} items ok.`);


                // If there are more objects to delete, do it
                if (result.IsTruncated) {
                    return emptyBucket({
                        'Bucket': obj.Bucket,
                        'ContinuationToken': result.NextContinuationToken
                    }, event, context, physicalResourceId);
                }
                return checkAndDeleteBucket(objParams.Bucket, event, context, physicalResourceId);
            });
        } else return checkAndDeleteBucket(objParams.Bucket, event, context, physicalResourceId);
    });
}

So first we list all of the objects in the bucket, and if there is an error we bail out by sending a response. The listObjectsV2 method will only return a maximum of 1,000 items per call, so if we have more we need to make subsequent requests with a continuation token. Next we create a list of objects to delete in the format required by the deleteObjects method in the aws-sdk. Again, if there is an error we send a response stating so. Otherwise, the first batch of items has been deleted and we check whether the listing was truncated. If so, we make a recursive call to emptyBucket with the continuation token needed to fetch the next batch of items.
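It's worth noting that deleteObjects also accepts at most 1,000 keys per call, which happens to line up neatly with listObjectsV2's page size, so each listed page can be deleted in a single call. If the keys came from somewhere else (say, an inventory report), they'd need chunking first. This chunkKeys helper is purely illustrative, not part of the original Lambda:

```javascript
// Illustrative helper: split a key list into batches no larger than
// the deleteObjects limit of 1,000 keys per request.
function chunkKeys(keys, size) {
    var chunks = [];
    for (var i = 0; i < keys.length; i += size) {
        chunks.push(keys.slice(i, i + size));
    }
    return chunks;
}

// 2,500 made-up keys produce three batches: 1000, 1000 and 500.
var keys = Array.from({ length: 2500 }, (_, i) => ({ Key: `object-${i}` }));
var batches = chunkKeys(keys, 1000);
console.log(batches.length);    // 3
console.log(batches[2].length); // 500
```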

 

You may have noticed logic based on whether the S3 bucket has versioning enabled or not. If the bucket has versioning enabled, we need to handle the listing and deletion of objects a little differently as shown below:

 

function emptyVersionedBucket(params, event, context, physicalResourceId) {
    console.log("emptyVersionedBucket(): ", JSON.stringify(params));
    s3.listObjectVersions(params, (e, data) => {
        if (e) return sendResponse(event, context, "FAILED", physicalResourceId, e);
        // Create the object needed to delete items from the bucket
        var obj = {
            'Bucket': params.Bucket,
            'Delete': {'Objects':[]}
        };
        var arr = data.DeleteMarkers.length > 0 ? data.DeleteMarkers : data.Versions;
        obj.Delete.Objects = arr.map(v => ({
            'Key': v.Key,
            'VersionId': v.VersionId
        }));


        return removeVersionedItems(obj, data, event, context, physicalResourceId);
    });
}

 

function removeVersionedItems(obj, data, event, context, physicalResourceId) {
    s3.deleteObjects(obj, (x, d) => {
        if (x) return sendResponse(event, context, "FAILED", physicalResourceId, x);


        console.log(`Removed ${d.Deleted.length} versioned items ok.`);
        // Was the original request truncated?
        if (data.IsTruncated) {
            return emptyVersionedBucket({
                'Bucket': obj.Bucket,
                'KeyMarker': data.NextKeyMarker,
                'VersionIdMarker': data.NextVersionIdMarker
            }, event, context, physicalResourceId);
        }


        // Are there markers to remove?
        var haveMarkers = d.Deleted.some(elem => elem.DeleteMarker);
        if (haveMarkers) {
            return emptyVersionedBucket({
                'Bucket': obj.Bucket
            }, event, context, physicalResourceId);
        }


        return checkAndDeleteBucket(obj.Bucket, event, context, physicalResourceId);
    });
}

 

Here we need to list the object versions, which returns a list of objects with their keys and version IDs. Using this data we can make a request to delete the objects, much as we deleted objects from an unversioned bucket. Once all versioned files (and any delete markers) have been removed, we move on to deleting the bucket.
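The shape transformation in emptyVersionedBucket is the key detail: every entry handed to deleteObjects for a versioned bucket must carry both a Key and a VersionId, otherwise S3 would just add a delete marker instead of removing the version. A small sketch with made-up sample data:

```javascript
// Sketch: shape listObjectVersions output (Versions or DeleteMarkers)
// into the { Key, VersionId } entries that deleteObjects expects.
function toDeleteEntries(versions) {
    return versions.map(v => ({
        Key: v.Key,
        VersionId: v.VersionId
    }));
}

// Hypothetical sample: two versions of the same key.
var sample = [
    { Key: 'logs/a.txt', VersionId: 'v1', IsLatest: false },
    { Key: 'logs/a.txt', VersionId: 'v2', IsLatest: true }
];

console.log(JSON.stringify(toDeleteEntries(sample)));
// [{"Key":"logs/a.txt","VersionId":"v1"},{"Key":"logs/a.txt","VersionId":"v2"}]
```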

 

function checkAndDeleteBucket(bucketName, event, context, physicalResourceId) {
    // Bucket is empty, delete it
    s3.headBucket({ 'Bucket': bucketName }, x => {
        if (x) {
            // Chances are the bucket has already been deleted
            // as if we are here, based on the fact we have listed
            // and deleted some objects, the deletion of the Bucket
            // has already taken place, so return SUCCESS
            // (Error could be either 404 or 403)
            return sendResponse(event, context, "SUCCESS", physicalResourceId, x);
        }
        s3.deleteBucket({ 'Bucket': bucketName }, error => {
            if (error) {
                console.log("ERROR: ", error);
                return sendResponse(event, context, "FAILED", physicalResourceId, error);
            }
            return sendResponse(event,
                context,
                "SUCCESS",
                physicalResourceId,
                null,
                {
                    'Message': `${bucketName} emptied and deleted!`
                }
            );
        });
    });
}

 

The headBucket method is a very useful way to check that the bucket actually exists. If it does, we call the deleteBucket method, as at this point the bucket should be empty. If all goes well we send a response that the bucket has been emptied and deleted! The sendResponse function is shown here for reference only, as it was taken (with some minor modifications) from The Tapir's Tale: Extending CloudFormation with Lambda-Backed Custom Resources.

 

function sendResponse(event, context, status, physicalResourceId, err, data) {
    var json = JSON.stringify({
        StackId: event.StackId,
        RequestId: event.RequestId,
        LogicalResourceId: event.LogicalResourceId,
        PhysicalResourceId: physicalResourceId || context.logStreamName,
        Status: status,
        Reason: "See details in CloudWatch Log: " + context.logStreamName,
        Data: data || { 'Message': status }
    });
    console.log("RESPONSE: ", json);


    var https = require('https');
    var url = require('url');


    var parsedUrl = url.parse(event.ResponseURL);
    var options = {
        hostname: parsedUrl.hostname,
        port: 443,
        path: parsedUrl.path,
        method: "PUT",
        headers: {
            "content-type": "",
            "content-length": Buffer.byteLength(json)
        }
    };


    var request = https.request(options, response => {
        console.log("STATUS: " + response.statusCode);
        console.log("HEADERS: " + JSON.stringify(response.headers));
        context.done();
    });


    request.on("error", error => {
        console.log("sendResponse Error: ", error);
        context.done();
    });


    request.write(json);
    request.end();
}

 

Now, whenever we build a stack, we can be confident that any bucket attached to it will be emptied and cleaned up on deletion. A future improvement could be a scalable queueing system that fires the same Lambda once per queue message, based on how many items there are in the buckets. Potentially we could have tens of thousands of objects that need deleting, and we can't afford to be bottlenecked by this.

 

Anyways, thanks for taking the time to read this and as always I welcome your comments and contributions. Laters!
