python, amazon-web-services, aws-lambda, boto3

AWS Lambda Async Invocation Issue: Function getting timed out intermittently when invoking another Lambda asynchronously


I am attempting to invoke an AWS Lambda function asynchronously from within another Lambda function using the boto3 SDK. The invocation is done using the following code snippet:

import json

import boto3

lambda_client = boto3.client('lambda')
# InvocationType="Event" requests an asynchronous invocation
response = lambda_client.invoke(
    FunctionName='async_function:alias', InvocationType="Event",
    Payload=json.dumps({'id': '101932092', 'type': 'type', 'sub_type': 'subtype'})
)

The issue I am encountering is that the invoking function sometimes times out (after 15 minutes) at the above code block. The behavior occurs intermittently and there is no clear pattern.

I have ruled out concurrency and throttling issues on the invoked function by checking the relevant metrics. However, even though the invoke call is supposed to put the event in an event queue for asynchronous processing (as per AWS Lambda documentation), the invoking function times out without providing a success or error response.

Any insights or suggestions for troubleshooting this would be greatly appreciated.


Solution

  • The most likely reason for this intermittent connectivity is that your Lambda function has been configured for VPC access and you have chosen a mix of private and public subnets.

    The fix is to configure the Lambda function for private subnets only or, if your Lambda functions only need to reach AWS services, then configure VPC Endpoints for the AWS services that you need access to.
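If you go the VPC endpoint route, the endpoint for the Lambda API can be created with boto3 roughly as sketched below. This is only an illustration: the helper names, and the VPC, subnet and security group IDs you pass in, are placeholders that are not from the question, and the security group is assumed to allow inbound HTTPS (443) from the function.

```python
def lambda_endpoint_params(vpc_id, region, subnet_ids, security_group_ids):
    """Build keyword arguments for ec2.create_vpc_endpoint.

    Describes an interface endpoint for the Lambda API so that functions
    in the VPC can call lambda.invoke without a route to the internet.
    """
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.lambda",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS lets the default Lambda hostname resolve to the endpoint.
        "PrivateDnsEnabled": True,
    }


def create_lambda_endpoint(vpc_id, region, subnet_ids, security_group_ids):
    """Create the endpoint (hypothetical helper; needs AWS credentials)."""
    import boto3  # imported here so the helper above has no AWS dependency

    ec2 = boto3.client("ec2", region_name=region)
    params = lambda_endpoint_params(vpc_id, region, subnet_ids, security_group_ids)
    return ec2.create_vpc_endpoint(**params)
```

With private DNS enabled on the endpoint, the existing boto3.client('lambda') call in the invoking function should not need to change.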

    The reason that the Lambda function fails intermittently is that it runs in a private subnet sometimes and in a public subnet at other times, depending on placement decisions made by the Lambda service. When the Lambda function executes in a public subnet, it has no network route to the internet or to AWS services. The reasons for this are:

    1. the Lambda function has a private IP but does not have a public IP
    2. the default route for traffic in a public subnet is the Internet Gateway, which drops traffic from private IPs (because they're not routable on the internet)
    3. the default route for traffic in a private subnet, if you set it up correctly to reach the internet, is a NAT instance or NAT gateway, which allows private IP traffic to be NATed to a public IP (the public IP of the NAT device) and hence that traffic can reach the internet
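To check whether this applies to you, one option is to look at the default route of each subnet the function is attached to. The sketch below classifies subnets from the output of describe_route_tables. It assumes all of the function's subnets are in a single VPC, ignores pagination, and treats a default route through an instance as a NAT instance; the function names and labels are made up for illustration.

```python
def default_route_target(route_table):
    """Return 'igw', 'nat', or None for the table's 0.0.0.0/0 route."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if (route.get("GatewayId") or "").startswith("igw-"):
            return "igw"
        if route.get("NatGatewayId") or route.get("InstanceId"):
            return "nat"
    return None


def classify_subnets(subnet_ids, route_tables):
    """Map each subnet ID to 'public', 'private-with-nat' or 'isolated'."""
    explicit = {}      # subnet ID -> explicitly associated route table
    main_table = None  # used by subnets with no explicit association
    for table in route_tables:
        for assoc in table.get("Associations", []):
            if assoc.get("SubnetId"):
                explicit[assoc["SubnetId"]] = table
            elif assoc.get("Main"):
                main_table = table
    labels = {"igw": "public", "nat": "private-with-nat", None: "isolated"}
    result = {}
    for subnet_id in subnet_ids:
        table = explicit.get(subnet_id, main_table)
        target = default_route_target(table) if table else None
        result[subnet_id] = labels[target]
    return result


def check_function_subnets(function_name):
    """Fetch the function's subnets and classify them (needs AWS credentials)."""
    import boto3  # imported here so the helpers above have no AWS dependency

    config = boto3.client("lambda").get_function_configuration(
        FunctionName=function_name
    )
    vpc_config = config.get("VpcConfig") or {}
    subnet_ids = vpc_config.get("SubnetIds") or []
    if not subnet_ids:
        return {}  # function is not attached to a VPC
    tables = boto3.client("ec2").describe_route_tables(
        Filters=[{"Name": "vpc-id", "Values": [vpc_config["VpcId"]]}]
    )["RouteTables"]
    return classify_subnets(subnet_ids, tables)
```

Any subnet reported as "public" is a candidate for the intermittent hang described above, since a function placed there has no working route out.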

    Also, see Why can't an AWS lambda function inside a public subnet in a VPC connect to the internet?
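While you are sorting out the networking, it may also help to make the invoke call fail fast instead of hanging until the function's own timeout. Below is a minimal sketch using botocore's Config; the timeout and retry values are arbitrary examples, not recommendations, and the helper names are hypothetical.

```python
import json


def fail_fast_settings(connect_timeout=5, read_timeout=10, max_attempts=2):
    """Keyword arguments for botocore.config.Config (example values only)."""
    return {
        "connect_timeout": connect_timeout,  # seconds to wait for a TCP connection
        "read_timeout": read_timeout,        # seconds to wait for a response
        "retries": {"max_attempts": max_attempts, "mode": "standard"},
    }


def invoke_async(function_name, payload):
    """Invoke asynchronously and surface connectivity failures quickly."""
    import boto3  # imported here so fail_fast_settings has no AWS dependency
    from botocore.config import Config
    from botocore.exceptions import (
        ConnectTimeoutError,
        EndpointConnectionError,
        ReadTimeoutError,
    )

    client = boto3.client("lambda", config=Config(**fail_fast_settings()))
    try:
        response = client.invoke(
            FunctionName=function_name,
            InvocationType="Event",
            Payload=json.dumps(payload),
        )
    except (ConnectTimeoutError, EndpointConnectionError, ReadTimeoutError) as exc:
        # No route to the Lambda endpoint: fail loudly rather than hang.
        raise RuntimeError(f"could not reach the Lambda API: {exc}") from exc
    return response["StatusCode"]  # 202 indicates the event was accepted
```

This does not fix the routing problem, but a connection error in the logs after a few seconds is much easier to diagnose than a silent 15 minute timeout.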