I am hoping if someone can help here. Below is part of script our developer used in Azure Synapse to read and create TempView from parquet file. It reads parquet file "Client" and create a TempView: aa_client:
aa_client = read_from_data_lake(aa_account_name, aa_file_system_name, aa_file_location + "Client", aa_file_format, aa_header, aa_infer_schema)
aa_client.createOrReplaceTempView("aa_client")
aa_client = aa_client.withColumn("client_group",when(aa_client.client_group=="UGR","UG"))
aa_client = aa_client.withColumn("last_update_timestamp",lit(None))
The first 2 lines of codes are working, but I was just wondering if I could also add the following:
Many Thanks Michelle
I reproduced the above and able to update the dataframe.
This is my sample data in dataframe df
:
I used the below code, and you can see the values are updated in the result dataframe.
aa_client = df.withColumn("client_group",when(df.client_group=="UGR","UG").otherwise(df["client_group"]))
aa_client = aa_client.withColumn("last_update_timestamp",lit(None))
I have used otherwise
after when in the first line to not update other values to null if the condition is not matched.
Result:
If you want to update your temporary view, you have create or Replace the temporary view again after the above operations. dataframe operations won't update the views.