I have a Glue job that writes a DynamicFrame to a CSV file in S3, but for some reason the nulls are written as empty fields. What is a good way to resolve this?
Desired Output in CSV:
user_id, example_assignment, example_product
null, null, null
null, llama, null
null, null, feed
Current Output in CSV:
user_id, example_assignment, example_product
,,
,llama,
,,feed
Glue Write Csv:
glueContext.getSinkWithFormat(
  connectionType = "s3",
  options = example_path,
  transformationContext = "example_transformation",
  format = "csv"
).writeDynamicFrame(exampleDF)
Glue currently does not provide a CSV write option comparable to Spark's nullValue (or emptyValue), which control how nulls and empty strings are rendered in the output.
You could either write with the Spark API directly, or fill those missing values before writing, for example with the aforementioned FillMissingValues class from Glue.
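As a rough sketch of the first option (writing through the Spark API), something like the following might work. It assumes the DynamicFrame is converted with toDF() and uses Spark's nullValue CSV option. The object name NullsInCsv, the helper writeWithNulls, and the output path are made up for illustration, and the overwrite mode is an assumption you may not want in your job.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

object NullsInCsv {
  // Hypothetical helper: writes df as CSV and renders every null
  // as the literal text "null" instead of an empty field.
  def writeWithNulls(df: DataFrame, path: String): Unit =
    df.write
      .mode(SaveMode.Overwrite)    // assumption: replacing existing output is acceptable
      .option("header", "true")    // keep the column names as the first row
      .option("nullValue", "null") // Spark CSV option: string written in place of nulls
      .csv(path)
}
```

In the Glue job this would be called roughly as NullsInCsv.writeWithNulls(exampleDF.toDF(), "s3://your-bucket/your-prefix/"), where the S3 path is a placeholder. Note that this bypasses the Glue sink, so anything tied to transformationContext (such as job bookmarks) would not apply to this write.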