Tags: python, amazon-web-services, pyspark, aws-glue

How to retrieve a field value from a Glue DynamicFrame by name


In a Spark DataFrame you can refer to a column by name, for example df['personId'], but that does not work with Glue's DynamicFrame. Is there a similar way to access a column's values by name directly, without converting the DynamicFrame to a DataFrame?


Solution

  • You can use select_fields, see https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-transforms-SelectFields.html.

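    In your case it would be df.select_fields(["personId"]) (the documented form takes a list of field paths). It returns a new DynamicFrame containing only that field, so depending on what you want to do, you can keep it as a new dynamic frame or just look at the data.

    new_frame = df.select_fields(["personId"])
    new_frame.show()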
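If you also need to use the field's value, not just keep the column, here is a minimal sketch. It assumes `dyf` is a Glue DynamicFrame and that records passed to `filter` can be indexed by field name, as in the Glue Filter examples. The helper name `select_person_ids` and its `min_value` parameter are made up for illustration and are not part of the Glue API.

```python
def select_person_ids(dyf, field="personId", min_value=None):
    """Keep one field of a DynamicFrame, optionally filtering on its value.

    Hypothetical helper: `dyf` is assumed to be an
    awsglue.dynamicframe.DynamicFrame created elsewhere in the Glue job.
    """
    # select_fields takes a list of field paths and returns a new DynamicFrame.
    selected = dyf.select_fields([field])
    if min_value is not None:
        # Inside filter, each record is addressed by field name.
        selected = selected.filter(f=lambda rec: rec[field] >= min_value)
    return selected
```

In a job you would call it as something like select_person_ids(df, min_value=100).show(). Everything stays a DynamicFrame, so no toDF() conversion is needed unless you want to collect the values on the driver.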