python-3.x, amazon-web-services, aws-glue

Get value of column in AWS Glue Custom Transform


I'm working on an ETL job in AWS Glue. I need to decode base64-encoded text from a table, and I'm doing that in a Custom Transform in Python 3.

My code is below:

def MyTransform(glueContext, dfc) -> DynamicFrameCollection:
    import base64

    newdf = dfc.select(list(dfc.keys())[0]).toDF()

    data = newdf["email"]

    data_to_decrypt = base64.b64decode(data)

I get the following error:

TypeError: argument should be a bytes-like object or ASCII string, not 'Column'

How can I get a plain string from the Column object?


Solution

  • I was wrong: this turned out to be something completely different from what I thought.

    The Column object from newdf["email"] refers to the whole column, i.e. all rows, so it's not possible to fetch a single value from it.

    What I ended up doing was iterating over all rows and mapping each one to new values, like this:

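    from awsglue.dynamicframe import DynamicFrame

    def map_row(row):
        # Inside the map, each field is a plain Python value, not a Column.
        # decrypt_jasypt_string is defined elsewhere (not shown).
        id = row.id
        client_key = row.client_key
        email = decrypt_jasypt_string(row.email.strip())
        phone = decrypt_jasypt_string(row.phone.strip())
        created_on = row.created_on
        return (id, email, phone, created_on, client_key)

    # Take the first DynamicFrame from the collection as a Spark DataFrame.
    df = dfc.select(list(dfc.keys())[0]).toDF()

    # Apply map_row to every row; the names below follow the tuple order.
    rdd2 = df.rdd.map(map_row)
    df2 = rdd2.toDF(["id", "email", "phone", "created_on", "client_key"])

    # The third argument is just a name for the resulting DynamicFrame.
    dyf_filtered = DynamicFrame.fromDF(df2, glueContext, "does it matter?")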
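
For the plain base64 case from the question, the per-row idea can be tried out locally without Spark. The sketch below is only an illustration under stated assumptions: decode_base64_text is a hypothetical helper (it is not the decrypt_jasypt_string used above), the namedtuple Row stands in for a Spark Row, and the column is assumed to hold base64-encoded UTF-8 text.

```python
import base64
from collections import namedtuple


def decode_base64_text(value):
    """Decode one base64-encoded string to text; pass None through unchanged."""
    if value is None:
        return None
    # b64decode accepts bytes or an ASCII str, which is why passing a
    # Spark Column raises the TypeError shown in the question.
    return base64.b64decode(value.strip()).decode("utf-8")


# Hypothetical stand-in for a Spark Row, so the sketch runs without a cluster.
Row = namedtuple("Row", ["id", "email"])


def map_row(row):
    # Same shape as the per-row mapping above: return a tuple of plain values.
    return (row.id, decode_base64_text(row.email))


# Build sample input by encoding a known value rather than hard-coding it.
encoded = base64.b64encode(b"user@example.com").decode("ascii")
rows = [Row(1, encoded), Row(2, None)]

decoded_rows = [map_row(r) for r in rows]
print(decoded_rows)
```

In a Glue job, the same map_row could probably be passed to df.rdd.map as in the answer above. If the data really is plain base64, it may also be worth checking whether pyspark.sql.functions.unbase64 (cast to string) fits, since that works on the Column directly and avoids the RDD round trip.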