we use CloudFormation and SAM to deploy our Lambda (Node.js) functions. All our Lambda functions has a layer set through Globals
. When we make breaking changes in the layer code we get errors during deployment because new Lambda functions are rolled out to production with old layer and after a few seconds (~40 seconds in our case) it starts using the new layer. For example, let's say we add a new class to the layer and we import it in the function code then we get an error that says NewClass is not found
for a few seconds during deployment (this happens because new function code still uses old layer which doesn't have NewClass
).
Is it possible to ensure new lambda function is always rolled out with the latest layer version?
Example CloudFormation template:
Globals:
Function:
Runtime: nodejs14.x
Layers:
- !Ref CoreLayer
Resources:
CoreLayer:
Type: AWS::Serverless::LayerVersion
Properties:
LayerName: core-layer
ContentUri: packages/coreLayer/dist
CompatibleRuntimes:
- nodejs14.x
Metadata:
BuildMethod: nodejs14.x
ExampleFunction:
Type: AWS::Serverless::Function
Properties:
FunctionName: example-function
CodeUri: packages/exampleFunction/dist
Example CloudFormation deployment events, as you can see new layer (CoreLayer123abc456
) is created before updating the Lambda function so it should be available to use in the new function code but for some reasons Lambda is updated and deployed with old layer version for a few seconds:
Timestamp | Logical ID | Status | Status reason |
---|---|---|---|
2022-05-23 16:26:54 | stack-name | UPDATE_COMPLETE | - |
2022-05-23 16:26:54 | CoreLayer789def456 | DELETE_SKIPPED | - |
2022-05-23 16:26:53 | v3uat-farthing | UPDATE_COMPLETE_CLEANUP_IN_PROGRESS | - |
2022-05-23 16:26:44 | ExampleFunction | UPDATE_COMPLETE | - |
2022-05-23 16:25:58 | ExampleFunction | UPDATE_IN_PROGRESS | - |
2022-05-23 16:25:53 | CoreLayer123abc456 | CREATE_COMPLETE | - |
2022-05-23 16:25:53 | CoreLayer123abc456 | CREATE_IN_PROGRESS | Resource creation Initiated |
2022-05-23 16:25:50 | CoreLayer123abc456 | CREATE_IN_PROGRESS - | |
2022-05-23 16:25:41 | stack-name | UPDATE_IN_PROGRESS | User Initiated |
Example changeset:
{
"resourceChange": {
"logicalResourceId": "ExampleFunction",
"action": "Modify",
"physicalResourceId": "example-function",
"resourceType": "AWS::Lambda::Function",
"replacement": "False",
"moduleInfo": null,
"details": [
{
"target": {
"name": "Environment",
"requiresRecreation": "Never",
"attribute": "Properties"
},
"causingEntity": "ApplicationVersion",
"evaluation": "Static",
"changeSource": "ParameterReference"
},
{
"target": {
"name": "Layers",
"requiresRecreation": "Never",
"attribute": "Properties"
},
"causingEntity": null,
"evaluation": "Dynamic",
"changeSource": "DirectModification"
},
{
"target": {
"name": "Environment",
"requiresRecreation": "Never",
"attribute": "Properties"
},
"causingEntity": null,
"evaluation": "Dynamic",
"changeSource": "DirectModification"
},
{
"target": {
"name": "Code",
"requiresRecreation": "Never",
"attribute": "Properties"
},
"causingEntity": null,
"evaluation": "Static",
"changeSource": "DirectModification"
},
{
"target": {
"name": "Layers",
"requiresRecreation": "Never",
"attribute": "Properties"
},
"causingEntity": "CoreLayer123abc456",
"evaluation": "Static",
"changeSource": "ResourceReference"
}
],
"changeSetId": null,
"scope": [
"Properties"
]
},
"hookInvocationCount": null,
"type": "Resource"
}
I didn't understand why it has 2 target.name: Layers
items in details
array. One of them has causingEntity: CoreLayer123abc456
which is expected due to newly created layer and the other has causingEntity: null
, not sure why this is there.
Originally posted on AWS re:Post here
Edit:
After a couple of tests it turns out that the issue is caused by the order of the changes from changeset. Looks like changes are applied one by one. For example for the following changeset it updates the old function code while still using the old layer and then updates the function layer with the latest version because Layers
change item comes after Code
change item.
{
"resourceChange":{
"logicalResourceId":"ExampleFunction",
"action":"Modify",
"physicalResourceId":"example-function",
"resourceType":"AWS::Lambda::Function",
"replacement":"False",
"moduleInfo":null,
"details":[
{
"target":{
"name":"Layers",
"requiresRecreation":"Never",
"attribute":"Properties"
},
"causingEntity":null,
"evaluation":"Dynamic",
"changeSource":"DirectModification"
},
{
"target":{
"name":"Code",
"requiresRecreation":"Never",
"attribute":"Properties"
},
"causingEntity":null,
"evaluation":"Static",
"changeSource":"DirectModification"
},
{
"target":{
"name":"Layers",
"requiresRecreation":"Never",
"attribute":"Properties"
},
"causingEntity":"CoreLayer123abc456",
"evaluation":"Static",
"changeSource":"ResourceReference"
}
],
"changeSetId":null,
"scope":[
"Properties"
]
},
"hookInvocationCount":null,
"type":"Resource"
}
But in some deployments the order is the other way around, such as:
{
"resourceChange":{
...
"details":[
...
{
"target":{
"name":"Layers",
"requiresRecreation":"Never",
"attribute":"Properties"
},
"causingEntity":"CoreLayer123abc456",
"evaluation":"Static",
"changeSource":"ResourceReference"
},
{
"target":{
"name":"Code",
"requiresRecreation":"Never",
"attribute":"Properties"
},
"causingEntity":null,
"evaluation":"Static",
"changeSource":"DirectModification"
}
],
...
}
In this case it updates the old function with the latest layer version and then updates the function code with the updated one. So for a couple of seconds old code is invoked with the latest layer version.
So does it possible to apply all these changes in only one single step? Similar to Atomicity in databases
I solved this by using Lambda versions as suggested here. For each deployment I create a new Lambda version, so old and new Lambda use the correct core layer.
Reposting my comment from the original post for reference:
I added a lambda version:
ExampleFunctionVersion: Type: AWS::Lambda::Version DeletionPolicy: Delete Properties: FunctionName: !Ref ExampleFunction
and replaced all function references with
!Ref ExampleFunctionVersion
.I also tried
AutoPublishAlias: live
, it worked too but I had to add anAWS::Lambda::Version
resource to be able to setDeletionPolicy: Delete