I am new to AWS Glue and would like your help with a very simple transformation. I am trying to learn AWS Glue.
Below is my data. I want to add a new column to the target dataset that shows 'Yes' if the movie rating is above 5 and 'No' otherwise. The movie_id and user_id combination is unique in the dataset.
My data:
id  movie_id  user_id  rating
1   abc       xyx      10
2   csd       xyx      8
3   abc       sss      3
4   csd       sss      5
Result:
id  movie_id  user_id  rating  Yes/No
1   abc       xyx      10      Yes
2   csd       xyx      8       Yes
3   abc       sss      3       No
4   csd       sss      5       No
This can be done with a UDF, something like the one shown below. You can read more about it here.
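from awsglue.transforms import Map

# Add a "Yes/No" field to each record based on its rating.
def deriveBool(rec):
    if rec["rating"] > 5:
        rec["Yes/No"] = 'Yes'
    else:
        rec["Yes/No"] = 'No'
    return rec

# datasource0 is assumed to be the DynamicFrame already loaded from the source.
datasource_mapped = Map.apply(frame = datasource0, f = deriveBool, transformation_ctx = "deriveboolvalues")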
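If you want to check the rule before running a Glue job, the same logic can be tried locally on plain Python dicts. This is only a rough sketch and does not use Glue at all: `derive_bool`, `rows`, and `result` are names made up for illustration, the rows are copied from the sample data in the question, and `rating` is assumed to be numeric.

```python
# Local sanity check of the rating rule; plain dicts stand in for Glue records.
def derive_bool(rec):
    out = dict(rec)  # copy so the input row is left untouched
    out["Yes/No"] = "Yes" if out["rating"] > 5 else "No"
    return out

# Sample rows taken from the question.
rows = [
    {"id": 1, "movie_id": "abc", "user_id": "xyx", "rating": 10},
    {"id": 2, "movie_id": "csd", "user_id": "xyx", "rating": 8},
    {"id": 3, "movie_id": "abc", "user_id": "sss", "rating": 3},
    {"id": 4, "movie_id": "csd", "user_id": "sss", "rating": 5},
]

result = [derive_bool(r) for r in rows]

for r in result:
    print(r["id"], r["movie_id"], r["user_id"], r["rating"], r["Yes/No"])
```

Note that a rating of exactly 5 gives 'No', since the condition is strictly "above 5", which matches the expected result in the question. If the rating column arrives as a string in your job, it would need to be cast to a number before the comparison.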