Search code examples
amazon-web-servicesaws-lambdaaws-cloudformationaws-sam

How to ensure using the latest lambda layer version when deploying with CloudFormation and SAM?


we use CloudFormation and SAM to deploy our Lambda (Node.js) functions. All our Lambda functions has a layer set through Globals. When we make breaking changes in the layer code we get errors during deployment because new Lambda functions are rolled out to production with old layer and after a few seconds (~40 seconds in our case) it starts using the new layer. For example, let's say we add a new class to the layer and we import it in the function code then we get an error that says NewClass is not found for a few seconds during deployment (this happens because new function code still uses old layer which doesn't have NewClass).

Is it possible to ensure new lambda function is always rolled out with the latest layer version?

Example CloudFormation template:

    Globals:
      Function:
        Runtime: nodejs14.x
        Layers:
          - !Ref CoreLayer
    
    Resources:
      CoreLayer:
        Type: AWS::Serverless::LayerVersion
        Properties:
          LayerName: core-layer
          ContentUri: packages/coreLayer/dist
          CompatibleRuntimes:
            - nodejs14.x
        Metadata:
          BuildMethod: nodejs14.x
    
      ExampleFunction:
        Type: AWS::Serverless::Function
        Properties:
          FunctionName: example-function
          CodeUri: packages/exampleFunction/dist

Example CloudFormation deployment events, as you can see new layer (CoreLayer123abc456) is created before updating the Lambda function so it should be available to use in the new function code but for some reasons Lambda is updated and deployed with old layer version for a few seconds:

Timestamp Logical ID Status Status reason
2022-05-23 16:26:54 stack-name UPDATE_COMPLETE -
2022-05-23 16:26:54 CoreLayer789def456 DELETE_SKIPPED -
2022-05-23 16:26:53 v3uat-farthing UPDATE_COMPLETE_CLEANUP_IN_PROGRESS -
2022-05-23 16:26:44 ExampleFunction UPDATE_COMPLETE -
2022-05-23 16:25:58 ExampleFunction UPDATE_IN_PROGRESS -
2022-05-23 16:25:53 CoreLayer123abc456 CREATE_COMPLETE -
2022-05-23 16:25:53 CoreLayer123abc456 CREATE_IN_PROGRESS Resource creation Initiated
2022-05-23 16:25:50 CoreLayer123abc456 CREATE_IN_PROGRESS -
2022-05-23 16:25:41 stack-name UPDATE_IN_PROGRESS User Initiated

Example changeset:


    {
      "resourceChange": {
        "logicalResourceId": "ExampleFunction",
        "action": "Modify",
        "physicalResourceId": "example-function",
        "resourceType": "AWS::Lambda::Function",
        "replacement": "False",
        "moduleInfo": null,
        "details": [
          {
            "target": {
              "name": "Environment",
              "requiresRecreation": "Never",
              "attribute": "Properties"
            },
            "causingEntity": "ApplicationVersion",
            "evaluation": "Static",
            "changeSource": "ParameterReference"
          },
          {
            "target": {
              "name": "Layers",
              "requiresRecreation": "Never",
              "attribute": "Properties"
            },
            "causingEntity": null,
            "evaluation": "Dynamic",
            "changeSource": "DirectModification"
          },
          {
            "target": {
              "name": "Environment",
              "requiresRecreation": "Never",
              "attribute": "Properties"
            },
            "causingEntity": null,
            "evaluation": "Dynamic",
            "changeSource": "DirectModification"
          },
          {
            "target": {
              "name": "Code",
              "requiresRecreation": "Never",
              "attribute": "Properties"
            },
            "causingEntity": null,
            "evaluation": "Static",
            "changeSource": "DirectModification"
          },
          {
            "target": {
              "name": "Layers",
              "requiresRecreation": "Never",
              "attribute": "Properties"
            },
            "causingEntity": "CoreLayer123abc456",
            "evaluation": "Static",
            "changeSource": "ResourceReference"
          }
        ],
        "changeSetId": null,
        "scope": [
          "Properties"
        ]
      },
      "hookInvocationCount": null,
      "type": "Resource"
    }

I didn't understand why it has 2 target.name: Layers items in details array. One of them has causingEntity: CoreLayer123abc456 which is expected due to newly created layer and the other has causingEntity: null, not sure why this is there.

Originally posted on AWS re:Post here

Edit:

After a couple of tests it turns out that the issue is caused by the order of the changes from changeset. Looks like changes are applied one by one. For example for the following changeset it updates the old function code while still using the old layer and then updates the function layer with the latest version because Layers change item comes after Code change item.


    {
      "resourceChange":{
        "logicalResourceId":"ExampleFunction",
        "action":"Modify",
        "physicalResourceId":"example-function",
        "resourceType":"AWS::Lambda::Function",
        "replacement":"False",
        "moduleInfo":null,
        "details":[
          {
            "target":{
              "name":"Layers",
              "requiresRecreation":"Never",
              "attribute":"Properties"
            },
            "causingEntity":null,
            "evaluation":"Dynamic",
            "changeSource":"DirectModification"
          },
          {
            "target":{
              "name":"Code",
              "requiresRecreation":"Never",
              "attribute":"Properties"
            },
            "causingEntity":null,
            "evaluation":"Static",
            "changeSource":"DirectModification"
          },
          {
            "target":{
              "name":"Layers",
              "requiresRecreation":"Never",
              "attribute":"Properties"
            },
            "causingEntity":"CoreLayer123abc456",
            "evaluation":"Static",
            "changeSource":"ResourceReference"
          }
        ],
        "changeSetId":null,
        "scope":[
          "Properties"
        ]
      },
      "hookInvocationCount":null,
      "type":"Resource"
    }

But in some deployments the order is the other way around, such as:


    {
      "resourceChange":{
        ...
        "details":[
          ...
          {
            "target":{
              "name":"Layers",
              "requiresRecreation":"Never",
              "attribute":"Properties"
            },
            "causingEntity":"CoreLayer123abc456",
            "evaluation":"Static",
            "changeSource":"ResourceReference"
          },
          {
            "target":{
              "name":"Code",
              "requiresRecreation":"Never",
              "attribute":"Properties"
            },
            "causingEntity":null,
            "evaluation":"Static",
            "changeSource":"DirectModification"
          }
        ],
        ...
    }

In this case it updates the old function with the latest layer version and then updates the function code with the updated one. So for a couple of seconds old code is invoked with the latest layer version.

So does it possible to apply all these changes in only one single step? Similar to Atomicity in databases


Solution

  • I solved this by using Lambda versions as suggested here. For each deployment I create a new Lambda version, so old and new Lambda use the correct core layer.

    Reposting my comment from the original post for reference:

    I added a lambda version:

    ExampleFunctionVersion:
      Type: AWS::Lambda::Version
      DeletionPolicy: Delete
      Properties:
        FunctionName: !Ref ExampleFunction
    

    and replaced all function references with !Ref ExampleFunctionVersion.

    I also tried AutoPublishAlias: live, it worked too but I had to add an AWS::Lambda::Version resource to be able to set DeletionPolicy: Delete