Tags: aws-lambda, redis, amazon-vpc, amazon-elasticache

AWS Lambda timeout when connecting to Redis ElastiCache in the same VPC


I am trying to publish from a Lambda function to a Redis ElastiCache instance, but I keep getting 502 Bad Gateway responses because the Lambda function times out.

I have successfully connected to the ElastiCache instance from an ECS service in the same VPC, which leads me to think that the VPC settings for my Lambda are incorrect. I tried following this tutorial (https://docs.aws.amazon.com/lambda/latest/dg/services-elasticache-tutorial.html) and have looked at several Stack Overflow threads, to no avail.

The Lambda Function:

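import * as redis from 'redis';
import { promisify } from 'util';
// ApiResponse, InvalidRequestError, and Logger are project-specific helpers (imports omitted).

export const publisher = redis.createClient({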
  url: XXXXXX, // env var containing the URL which is also used in the ECS server to successfully connect
});

export const handler = async (
  event: AWSLambda.APIGatewayProxyWithCognitoAuthorizerEvent
): Promise<AWSLambda.APIGatewayProxyResult> => {
  try {
    if (!event.body || !event.pathParameters || !event.pathParameters.channelId)
      return ApiResponse.error(400, {}, new InvalidRequestError());

    const { action, payload } = JSON.parse(event.body) as {
      action: string;
      payload?: { [key: string]: string };
    };

    const { channelId } = event.pathParameters;

    const publishAsync = promisify(publisher.publish).bind(publisher);

    await publishAsync(
      channelId,
      JSON.stringify({
        action,
        payload: payload || {},
      })
    );

    return ApiResponse.success(204);
  } catch (e) {
    Logger.error(e);
    return ApiResponse.error();
  }
};

In my troubleshooting, I have verified the following in the Lambda functions console:

  • The correct role is shown under Configuration > Permissions.
  • The Lambda function has access to the VPC (Configuration > VPCs), the subnets, and the same SG as the ElastiCache instance.
  • The SG allows all traffic from anywhere.
  • The problem is indeed the Redis connection. Using console.log, I confirmed that the code stops at this line: await publishAsync()

I am sure it is something small, but it is racking my brain!

Update 1:

I tried adding an error handler to log any issues with the publish, in addition to the main try/catch block, but it does not log anything.

publisher.on('error', (e) => {
  Logger.error(e, 'evses-service', 'message-publisher');
});

I have also copied my ElastiCache setup: [screenshot]

And my ElastiCache subnet group: [screenshot]

And my Lambda VPC settings: [screenshot]

And confirmation that my Lambda has the right access: [screenshot]

Update 2:

I tried to follow the tutorial here (https://docs.aws.amazon.com/lambda/latest/dg/services-elasticache-tutorial.html) word for word, but I get the same issue: no logs, just a timeout after 30 seconds. Here is the test code:

const crypto = require('crypto');
const redis = require('redis');
const util = require('util');

const client = redis.createClient({
  url: 'rediss://clusterforlambdatest.9nxhfd.0001.use1.cache.amazonaws.com',
});

client.on('error', (e) => {
  console.log(e);
});

exports.handler = async (event) => {
  try {
    const len = 24;

    const randomString = crypto
      .randomBytes(Math.ceil(len / 2))
      .toString('hex') // convert to hexadecimal format
      .slice(0, len)
      .toUpperCase();

    const setAsync = util.promisify(client.set).bind(client);
    const getAsync = util.promisify(client.get).bind(client);

    await setAsync(randomString, 'We set this string bruh!');
    const doc = await getAsync(randomString);

    console.log(`Successfully received document ${randomString} with contents: ${doc}`);

    return;
  } catch (e) {
    console.log(e);

    return {
      statusCode: 500,
    };
  }
};

Solution

  • If the function times out and the Lambda networking is configured correctly, check the following:

    • Redis TLS configuration: compare the connection URL scheme (rediss:// means TLS) with the cluster configuration (in-transit encryption) and the client configuration (tls: {}). All three must agree.
    • Configure the client with an explicit retry strategy, so that a connection problem surfaces as a catchable error before the Lambda timeout is reached.
    • Check the VPC network ACLs and the security groups.
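
The first two points could be sketched roughly as follows, assuming the callback-based node-redis v3.x client (which the promisify calls in the question suggest). The helper names buildClientOptions and retryStrategy, and the limits of 3 attempts and 5 seconds, are illustrative assumptions and do not come from the question; adjust them so the retry budget stays well below the Lambda timeout.

```javascript
// Sketch for node-redis v3.x option names (retry_strategy, connect_timeout, tls).
// All names and limits below are hypothetical, chosen for illustration only.
const MAX_ATTEMPTS = 3;          // assumed: stop after 3 reconnect attempts
const MAX_RETRY_TIME_MS = 5000;  // assumed: keep well under the Lambda timeout

// node-redis v3 calls this with { attempt, total_retry_time, error, times_connected }.
// Returning a number schedules the next retry after that many milliseconds.
// Returning an Error stops retrying, and queued commands fail with an error
// instead of waiting in the offline queue.
function retryStrategy(options) {
  if (options.attempt > MAX_ATTEMPTS || options.total_retry_time > MAX_RETRY_TIME_MS) {
    return new Error('Redis unreachable: retry budget exhausted');
  }
  // Linear backoff: 200 ms, 400 ms, 600 ms, capped at 1000 ms.
  return Math.min(options.attempt * 200, 1000);
}

// Builds the options object for redis.createClient().
// TLS is switched on only when the URL uses the rediss:// scheme, so the
// client setting follows the URL; the cluster itself must still have
// in-transit encryption enabled for a rediss:// URL to work.
function buildClientOptions(url) {
  const options = {
    url,
    retry_strategy: retryStrategy,
    connect_timeout: MAX_RETRY_TIME_MS,
  };
  if (url.startsWith('rediss://')) {
    options.tls = {};
  }
  return options;
}

// Usage inside the Lambda module (requires the redis package):
// const client = require('redis').createClient(buildClientOptions(process.env.REDIS_URL));
// client.on('error', (e) => console.log(e));
```

With a bounded retry strategy, a promisified command such as publish or set should reject once the budget is exhausted, so the existing try/catch in the handler has something to log instead of the invocation ending silently at the Lambda timeout.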