We run a large application on top of the Serverless Framework, and both our modules and our user base keep growing. All our backend code lives in a monorepo, is split into 29 Serverless services (microservices), and exposes close to 300 REST and WebSocket endpoints.
We deploy from GitHub Actions and had no issues until recently (the last few weeks to months).
When sls deploy is executed from GitHub Actions, one or two seemingly random functions are reported as deployed in the log files, but when we call them a Cannot find module error is logged.
Unfortunately, our own code is never invoked in that case, so the error is also invisible to our own telemetry system.
The error is not related to our imports: as the stack trace below shows, it is the Serverless function's handler module itself that cannot be found in the function bundle.
2023-10-25T09:52:03.202Z undefined ERROR Uncaught Exception
{
  "errorType": "Runtime.ImportModuleError",
  "errorMessage": "Error: Cannot find module 's_putAnsweredQuestionnaire'\nRequire stack:\n- /var/runtime/UserFunction.js\n- /var/runtime/Runtime.js\n- /var/runtime/index.js",
  "stack": [
    "Runtime.ImportModuleError: Error: Cannot find module 's_putAnsweredQuestionnaire'",
    "Require stack:",
    "- /var/runtime/UserFunction.js",
    "- /var/runtime/Runtime.js",
    "- /var/runtime/index.js",
    " at _loadUserApp (/var/runtime/UserFunction.js:225:13)",
    " at Object.module.exports.load (/var/runtime/UserFunction.js:300:17)",
    " at Object.<anonymous> (/var/runtime/index.js:43:34)",
    " at Module._compile (internal/modules/cjs/loader.js:1114:14)",
    " at Object.Module._extensions..js (internal/modules/cjs/loader.js:1143:10)",
    " at Module.load (internal/modules/cjs/loader.js:979:32)",
    " at Function.Module._load (internal/modules/cjs/loader.js:819:12)",
    " at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:75:12)",
    " at internal/main/run_main_module.js:17:47"
  ]
}
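Since the trace points at a missing handler file rather than a missing dependency, one way to catch this before deploying might be to inspect the packaged bundle. The following is a minimal sketch, assuming the artifact has already been unpacked into a directory (for example after sls package). The function names, the accepted file extensions, and the rule of splitting the file name at its first dot are assumptions of this sketch, not something confirmed by the framework or by our setup.

```javascript
const fs = require('fs');
const path = require('path');

// Extensions accepted for a handler module (an assumption of this sketch).
const MODULE_EXTENSIONS = ['.js', '.mjs', '.cjs'];

// Split a handler string such as "s_putAnsweredQuestionnaire.handler" into
// the module path and the export name. The file name is split at its first
// dot; any directory part is kept in front of the module name.
function parseHandler(handler) {
  const dir = path.posix.dirname(handler); // "." when there is no directory
  const base = path.posix.basename(handler);
  const firstDot = base.indexOf('.');
  if (firstDot <= 0 || firstDot === base.length - 1) {
    throw new Error(`Unexpected handler format: ${handler}`);
  }
  return {
    modulePath: path.posix.join(dir, base.slice(0, firstDot)),
    exportName: base.slice(firstDot + 1),
  };
}

// Return the handlers whose module file is absent from the unpacked bundle.
function findMissingHandlers(bundleDir, handlers) {
  return handlers.filter((handler) => {
    const { modulePath } = parseHandler(handler);
    return !MODULE_EXTENSIONS.some((ext) =>
      fs.existsSync(path.join(bundleDir, modulePath + ext))
    );
  });
}
```

A CI step could call findMissingHandlers with the handler strings of a service and fail the job if the returned list is not empty. Whether the broken bundles are already incomplete at packaging time is not something we have verified.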
sls --version
Running "serverless" from node_modules
Framework Core: 3.22.0 (local) 3.21.0 (global)
Plugin: 6.2.2
SDK: 4.3.2
Target: node14 (still)
Plugins:
The GitHub Actions runner is also running with node14.
The GitHub Actions logs do not show any related errors. GitHub Actions would fail on any deployment problem, and CloudFormation should (and in the past did) roll back the failed stack (service).
The CloudFormation logs for a broken deployment do not show any errors whatsoever. The stacks appear to have deployed as usual.
Deployment from developer machines: If we redeploy a broken service from a developer machine (check out the tag and run sls deploy with the correct stage params etc.), it works 100% of the time. This is currently our recovery routine.
The Serverless Dashboard also does not show any failed deployments for the affected services.
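Because neither GitHub Actions, CloudFormation, nor the Dashboard reports the broken functions, a post-deploy smoke check could make them visible. The sketch below only covers the classification step: it assumes that each function has been invoked once after deployment and that the response payload of a broken function has the same shape as the logged error above (an errorType of Runtime.ImportModuleError). The function names and the shape of the results object are hypothetical.

```javascript
// Decide whether an invocation payload indicates that the runtime failed to
// load the handler module, as opposed to an error thrown by our own code.
function isImportModuleError(payloadText) {
  let payload;
  try {
    payload = JSON.parse(payloadText);
  } catch (err) {
    return false; // not JSON, so not the structured runtime error
  }
  return (
    payload !== null &&
    typeof payload === 'object' &&
    payload.errorType === 'Runtime.ImportModuleError'
  );
}

// Extract the missing module name from the error message, or null if the
// payload is not an import error or the message has an unexpected format.
function missingModuleName(payloadText) {
  if (!isImportModuleError(payloadText)) return null;
  const { errorMessage } = JSON.parse(payloadText);
  const match = /Cannot find module '([^']+)'/.exec(String(errorMessage));
  return match ? match[1] : null;
}

// Given an object mapping function names to payload texts collected by the
// smoke invocations, list the functions whose bundle appears to be broken.
function findBrokenFunctions(results) {
  return Object.keys(results).filter((name) =>
    isImportModuleError(results[name])
  );
}
```

If findBrokenFunctions returns a non-empty list, the pipeline could fail or trigger a redeploy of the affected service, which would automate the manual recovery routine described above.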
Has anyone experienced the same issue?
Given the size of our application (30 services, 300 functions), could we have hit some limitation within AWS that is just not apparent to us?
Where else can we look?
We were able to resolve this problem by upgrading to version 3.38.0 of the Serverless Framework (current at the time) and, in parallel, moving to Node 16.20.0. This of course required us to switch to the nodejs16.x target in Serverless.
Since these upgrades, deployments have been stable again.
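Since the fix involved moving the build environment and the Lambda target to the same Node major version, a small guard in the pipeline can keep the two aligned. This is a minimal sketch with hypothetical function names. Nothing in our investigation shows that a version mismatch caused the original problem, so treat it as a precaution rather than a diagnosis.

```javascript
// Extract the major version from a Lambda runtime identifier such as "nodejs16.x".
function runtimeMajor(runtime) {
  const match = /^nodejs(\d+)\.x$/.exec(runtime);
  if (!match) throw new Error(`Unsupported runtime identifier: ${runtime}`);
  return Number(match[1]);
}

// Extract the major version from a Node.js version string such as "v16.20.0".
function nodeMajor(version) {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) throw new Error(`Unexpected Node.js version: ${version}`);
  return Number(match[1]);
}

// True when the Node.js version running the build has the same major
// version as the configured Lambda runtime target. Defaults to the
// version of the current process.
function buildMatchesRuntime(runtime, nodeVersion = process.version) {
  return runtimeMajor(runtime) === nodeMajor(nodeVersion);
}
```

For example, buildMatchesRuntime('nodejs16.x') could be called at the start of the deploy job, and the job aborted when it returns false.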
Unfortunately, we were not able to move to the nodejs18.x target because of our aws-sdk dependencies: the nodejs18.x Lambda runtime ships AWS SDK v3 instead of v2, so our code would first have to be migrated.