I'm working on ETL in AWS Glue. I need to decode text from table which is in base64 - I'm doing that in Custom Transform in Python3.
My code is below:
def MyTransform (glueContext, dfc) -> DynamicFrameCollection:
import base64
newdf = dfc.select(list(dfc.keys())[0]).toDF()
data = newdf["email"]
data_to_decrypt = base64.b64decode(data)
I've got error like that:
TypeError: argument should be a bytes-like object or ASCII string, not 'Column'
How to get plan string from the Column object?
I was wrong and it was completely different thing than I thought.
Column object from newdf["email"] consists all rows for this single column, so it's not possible to just fetch one value from that.
What I ended up doing is iterating through whole rows and mapping them to new value like that:
def map_row(row):
id = row.id
client_key = row.client_key
email = decrypt_jasypt_string(row.email.strip())
phone = decrypt_jasypt_string(row.phone.strip())
created_on = row.created_on
return (id, email, phone, created_on, client_key)
df = dfc.select(list(dfc.keys())[0]).toDF()
rdd2=df.rdd.map(lambda row: map_row(row))
df2=rdd2.toDF(["id","email","phone", "created_on", "client_key"])
dyf_filtered = DynamicFrame.fromDF(df2, glueContext, "does it matter?")