Tags: amazon-web-services, apache-spark, amazon-s3, pyspark, aws-glue

com.amazonaws.services.gluejobexecutor.model.InternalServiceException: Item size to update has exceeded the maximum allowed size


I'm using an AWS Glue ETL job to transform data from S3 and write it to another S3 bucket, with job bookmarking enabled.
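For context, the job follows the standard bookmark-enabled pattern; a minimal sketch (bucket names, formats, and transformation_ctx values are placeholders, not the actual script):

import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Bookmarks require a transformation_ctx on each source/sink
# and a final job.commit() to persist the bookmark state.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read only the files added since the last committed bookmark.
source = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://source-bucket/input/"]},
    format="json",
    transformation_ctx="source",
)

# ... transformations ...

glue_context.write_dynamic_frame.from_options(
    frame=source,
    connection_type="s3",
    connection_options={"path": "s3://target-bucket/output/"},
    format="parquet",
    transformation_ctx="sink",
)

# Persists the bookmark state; this is the call that raises the exception below.
job.commit()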

I am receiving this unexpected exception on a scheduled job that had been running without any problems until the previous day:

Traceback (most recent call last):
  File "my_script.py", line 123, in <module>
    job.commit()

[...]

py4j.protocol.Py4JJavaError: An error occurred while calling z:com.amazonaws.services.glue.util.Job.commit.
: com.amazonaws.services.gluejobexecutor.model.InternalServiceException: Item size to update has exceeded the maximum allowed size 
(Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: /) 
(Service: AWSGlueJobExecutor; Status Code: 500; Error Code: InternalServiceException; Request ID: /)

I see DynamoDB mentioned in the error message, but I'm not using that service at all (so I suspect it is used internally by Glue).

What is causing this exception?


Solution

  • I reset the job bookmark from the Glue console and the problem was solved; the job execution succeeded. The DynamoDB error suggests the bookmark state that Glue maintains internally had grown past DynamoDB's item size limit, and resetting the bookmark clears that state. The reset can also be scripted; see the sketch below.
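For reference, the same reset can be done outside the console, e.g. with boto3 (a minimal sketch; the job name is a placeholder):

import boto3

# Clears the bookmark state for the job, so the next run
# reprocesses the input from scratch.
glue = boto3.client("glue")
glue.reset_job_bookmark(JobName="my-etl-job")

The AWS CLI equivalent is: aws glue reset-job-bookmark --job-name my-etl-job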