I'm new to PySpark. I'm trying to loop over the DataFrame columns that also appear in a list. This is what I've tried, but it doesn't work:
column_list = ['colA','colB','colC']
for col in df:
    if col in column_list:
        df = df.withColumn(...)
    else:
        pass
It's definitely an issue with the loop. I feel like I'm missing something really simple here. I performed the df operation independently on each column and it ran clean, i.e.:
df = df.withColumn(...'colA').withColumn(...'colB').withColumn(...'colC')
Iterate over df.columns, which returns the column names as a list of strings, rather than over df itself:
column_list = ['colA','colB','colC']
for col in df.columns:
    if col in column_list:
        df = df.withColumn(....)
    else:
        pass
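The same idea can be wrapped in a small helper. This is only a sketch: the names columns_to_update, apply_to_columns, and transform are made up for illustration, and since the question elides the actual withColumn operation, the transform is passed in as a function rather than guessed. The loop variable is called name instead of col so that it does not shadow pyspark.sql.functions.col if you import it.

```python
def columns_to_update(existing, wanted):
    """Return the names in `wanted` that are present in `existing`, in `wanted` order."""
    present = set(existing)
    return [name for name in wanted if name in present]


def apply_to_columns(df, wanted, transform):
    """Apply transform(Column) to every wanted column that exists on df.

    `transform` is a hypothetical callable standing in for the operation
    the question leaves out; it takes a Column and returns a Column.
    """
    # Imported here so columns_to_update stays usable without Spark installed.
    from pyspark.sql import functions as F

    for name in columns_to_update(df.columns, wanted):
        # Overwrite the column in place with its transformed version.
        df = df.withColumn(name, transform(F.col(name)))
    return df
```

Assuming, for example, that the operation is an upper-casing of string columns, it could be called as df = apply_to_columns(df, column_list, F.upper) with F being pyspark.sql.functions. Names in column_list that are missing from df are skipped, which matches the else: pass branch above.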