In a Spark DataFrame you can address a column's value in the schema by using its name like df['personId']
- but that way does not work with Glue's DynamicFrame. Is there a similar way, without converting the DynamicFrame to a DataFrame, to directly access a columns values by name?
You can use select_fields
, see
https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-transforms-SelectFields.html.
In your case it would be df.select_fields("personId")
. Depending on what you want to do, you can save it as a new dynamic frame or just look at the data.
new_frame = df.select_fields("personId")
new_frame.show()