I'm running a Python script in an AWS Lambda function. It is triggered by SQS messages that tell the script which objects to load from an S3 bucket for further processing.
The permissions seem to be set up correctly, with a bucket policy that allows the Lambda's execution role to perform any action on any object in the bucket, and the Lambda can access everything most of the time. The objects are loaded via `pandas` and `s3fs`: `pandas.read_csv(f's3://{s3_bucket}/{object_key}')`.
However, when a new object is uploaded to the S3 bucket, the Lambda can't access it at first. The `botocore` SDK throws `An error occurred (403) when calling the HeadObject operation: Forbidden` when trying to access the object. Repeated invocations of the Lambda (even 50+) over several minutes (via SQS) give the same error. Yet when the Lambda is invoked with a different SQS message (one that loads different objects from S3) and then re-invoked with the original message, it can suddenly access the S3 object that previously failed every time. All subsequent attempts to access this object from the Lambda then succeed.
I'm at a loss for what could cause this. The three-step process is repeatable: (1) fail on the newly uploaded object, (2) run with other objects, (3) succeed on the original object. It can all happen within one Lambda container (the invocations share a single CloudWatch log stream, which seems to correlate with a Lambda container), so it doesn't seem to be a matter of needing a fresh Lambda container/instance.
Thoughts or ideas on how to further debug this?
Amazon S3 is an object storage system, not a filesystem. It is accessible via API calls that perform actions like `GetObject`, `PutObject` and `ListBucket`.
Utilities like `s3fs` allow an Amazon S3 bucket to be 'mounted' as a filesystem. However, behind the scenes `s3fs` makes normal API calls like any other program would.
This can sometimes (often?) lead to problems, especially where objects are being created, updated and deleted in quick succession: `s3fs` caches information about a bucket's contents, so its view of S3 can lag behind the bucket's actual state in a way a local filesystem never would.
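One way to test that theory (a minimal sketch, assuming the `s3fs`/`fsspec` versions in your Lambda expose the standard fsspec cache hooks, and reusing the `s3_bucket` and `object_key` variables from the question): in a warm Lambda container, fsspec caches both `S3FileSystem` instances and directory listings across invocations, so a stale "not found" result can persist. Clearing those caches before the read shows whether caching is the culprit.

```python
import pandas as pd
import s3fs

# In a warm Lambda container the same S3FileSystem instance (and its
# listing cache) survives across invocations, so drop both caches
# before looking up the just-uploaded object.
s3fs.S3FileSystem.clear_instance_cache()   # forget cached filesystem instances

fs = s3fs.S3FileSystem()
fs.invalidate_cache()                      # drop cached directory listings

# A fresh lookup now goes to the S3 API instead of a stale cache.
with fs.open(f'{s3_bucket}/{object_key}', 'rb') as f:
    df = pd.read_csv(f)
```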
Therefore, it is not recommended to use tools like s3fs to 'mount' S3 as a filesystem, especially for production use. It is better to call the AWS API directly, as sketched below.
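For example (a sketch assuming the same `s3_bucket` and `object_key` values from the question), the Lambda can fetch the object with `boto3` and hand the bytes to pandas, bypassing s3fs and its caches entirely:

```python
import io

import boto3
import pandas as pd

s3 = boto3.client('s3')

def load_csv(s3_bucket: str, object_key: str) -> pd.DataFrame:
    # GetObject goes straight to the S3 API on every call, so there is
    # no client-side cache that could return a stale "not found" result.
    response = s3.get_object(Bucket=s3_bucket, Key=object_key)
    return pd.read_csv(io.BytesIO(response['Body'].read()))
```

If the 403 still appears with a direct `get_object` call, the problem is on the permissions side; if it disappears, client-side caching in s3fs is the likely explanation.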