Tags: python, python-3.x, amazon-web-services, amazon-s3, boto3

Lambda hangs while uploading to S3, but the same upload from a local server works fine


The Lambda is in a public subnet, but the S3 bucket is public regardless.

The Lambda has the FullS3Access IAM role, and I tried creating an S3 endpoint in my VPC, to no avail.

The bucket's only custom permission is a single statement that allows everyone to get objects from it.

This is the piece of code where it hangs:

    s3 = boto3.client(
        "s3",
        region_name=variables.secrets.AWS_REGION_NAME,
        aws_access_key_id=variables.secrets.AWS_ACCESS_KEY_ID,
        aws_secret_access_key=variables.secrets.AWS_SECRET_ACCESS_KEY,
    )
    logger.debug("Connected to boto client")

    bucket_name = "user-images"
    file_name = f"public_{uuid4()}.{image.filename.split('.')[-1]}"

    try:
        logger.debug("Uploading file to s3...")
        s3.upload_fileobj(
            image.file,
            bucket_name,
            file_name,
            ExtraArgs={
                "ContentType": image.content_type,
            },
        )
        logger.debug("File uploaded to s3")
    except Exception as e:
        logger.error(str(e))
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail=str(e),
        )

The last debug statement I get is "Uploading file to s3...", after which it hangs indefinitely.

The file is 60 KB, so it shouldn't be a bandwidth- or hardware-related issue.

Perhaps most importantly, the same server running on localhost uploads to S3 without any issues.

Why is this happening?


Solution

  • If your Lambda function does not strictly need access to private resources in the VPC, then don't configure it for a VPC. That way it has an automatic route to the internet and to AWS service endpoints. (The AWS Lambda service provides this: your function technically runs on a compute instance inside a Lambda service VPC, and that service VPC has routing and NAT built in.)

    If your Lambda function does need access to private resources in the VPC, e.g. an RDS database or an Elasticsearch cluster, then configure it for the VPC and place it in private subnet(s). HA best practice is multiple private subnets spanning multiple AZs. Do not use public subnets: your Lambda function has no public IP, so any traffic from it towards a public endpoint will be dropped at the IGW. Because that traffic is dropped silently, the upload hangs rather than failing with an error.

    If your Lambda function needs to reach AWS service endpoints from those private subnets, you can do that via VPC endpoints, or via a subnet route to a NAT instance or NAT gateway in a public subnet of your VPC (which must have an IGW, of course).
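
The cases above can usually be told apart by looking at the route table of the subnet the Lambda function is attached to. The following is a minimal sketch, not an official AWS tool: `classify_s3_path` is a hypothetical helper, and it assumes route dicts shaped like the `Routes` entries that boto3's `ec2.describe_route_tables` returns. It is only a heuristic. For instance, it does not verify that a gateway endpoint's prefix list actually belongs to S3, and it ignores interface endpoints, which do not appear in route tables.

```python
from typing import Any, Iterable, Mapping


def classify_s3_path(routes: Iterable[Mapping[str, Any]]) -> str:
    """Guess how a subnet's route table lets a VPC-attached Lambda reach S3.

    Returns one of:
      "gateway-endpoint"  a prefix-list route to a vpce-* target exists
      "nat"               the default route goes to a NAT gateway or instance
      "igw-only"          the default route goes straight to an IGW (public
                          subnet); a Lambda ENI has no public IP, so this
                          path does not work for it
      "none"              no route towards S3 or the internet was found
    """
    has_igw_default = False
    for route in routes:
        gateway_id = route.get("GatewayId") or ""
        # Gateway VPC endpoints show up as prefix-list routes to a vpce-* target.
        # Assumption: the prefix list is the S3 one (not checked here).
        if route.get("DestinationPrefixListId") and gateway_id.startswith("vpce-"):
            return "gateway-endpoint"
        if route.get("DestinationCidrBlock") == "0.0.0.0/0":
            # NAT gateway routes carry NatGatewayId; NAT instances carry InstanceId.
            if route.get("NatGatewayId") or route.get("InstanceId"):
                return "nat"
            if gateway_id.startswith("igw-"):
                has_igw_default = True
    return "igw-only" if has_igw_default else "none"


# Illustrative route table of a public subnet (IDs are made up).
public_subnet_routes = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc"},
]
print(classify_s3_path(public_subnet_routes))  # igw-only
```

A result of "igw-only" matches the situation in the question: the subnet is public, but the function's traffic to S3 is dropped at the IGW. Note that simply creating an S3 gateway endpoint is not enough; it also has to be associated with the route table of the Lambda function's subnet before a `vpce-*` route appears there.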