I've written an AWS Glue ETL job script in Python, and I'm looking for the proper way to perform conditional writes to the DynamoDB table I'm using as the target.
# Write to DynamoDB
glueContext.write_dynamic_frame_from_options(
    frame=SelectFromCollection_node1665510217343,
    connection_type="dynamodb",
    connection_options={
        "dynamodb.output.tableName": args["OUTPUT_TABLE_NAME"]
    }
)
My script writes to DynamoDB with write_dynamic_frame_from_options. The AWS Glue connection parameter docs make no mention of any way to customize the write behavior through the connection options.
Is there a clean way to write conditionally without using boto?
You cannot do conditional updates with the EMR DynamoDB connector, which is what Glue uses. It does a complete overwrite of the data. For conditional writes you would have to use Boto3 and distribute the work across the Spark executors with foreachPartition.
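A minimal sketch of that approach, under some assumptions: the partition key attribute is called "id" (hypothetical, substitute your own), the condition you want is "only write if the item does not exist yet", and put_if_absent and make_partition_writer are helper names made up for this example. It is not the only way to structure this.

```python
def put_if_absent(table, item, key_attr="id"):
    """Write `item` only if no item with the same key exists.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    `key_attr` is the partition key name ("id" is an assumption here).
    Returns True if the item was written, False if the condition failed.
    """
    try:
        table.put_item(
            Item=item,
            ConditionExpression="attribute_not_exists(#k)",
            ExpressionAttributeNames={"#k": key_attr},
        )
        return True
    except Exception as exc:
        # boto3 raises botocore's ClientError, which carries the error code
        # in exc.response["Error"]["Code"].
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "ConditionalCheckFailedException":
            return False  # item already exists, skip it
        raise


def make_partition_writer(table_name, key_attr="id"):
    """Build a function suitable for DataFrame.foreachPartition."""

    def write_partition(rows):
        # Import and create the resource on the executor, once per partition,
        # because boto3 resources cannot be shipped from the driver.
        import boto3

        table = boto3.resource("dynamodb").Table(table_name)
        for row in rows:
            put_if_absent(table, row.asDict(), key_attr)

    return write_partition


# Usage inside the Glue job (not executed here):
# df = SelectFromCollection_node1665510217343.toDF()
# df.foreachPartition(make_partition_writer(args["OUTPUT_TABLE_NAME"], "id"))
```

Two things to watch for: the boto3 Table resource rejects Python floats, so numeric columns may need converting to Decimal first, and each put_item is a separate request, so this will be slower than the connector's batched writes.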