I am using a function to collapse all runs of whitespace in a PySpark DataFrame into single spaces. I can apply this function to individual columns using .withColumn. I have around 120 columns of mixed types, and I would like to apply the function only to the string columns. For that, I created a list containing only the names of the string-typed columns. How do I feed (apply, map?) this list to my function using withColumn?
import pandas as pd
import quinn
# example data (assumes an active SparkSession named `spark`)
data = {
'fruits': ["apples", " banana", "cherry"],
'veggies': [1, 0, 1],
'meat': ["pig", "cow", " chicken "]}
df = pd.DataFrame(data)
ddf = spark.createDataFrame(df)
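# dtypes on the Spark DataFrame is a list of (name, type) tuples, so read it from ddf, not the pandas df
mylist_column = [item[0] for item in ddf.dtypes if item[1].startswith('string')]
# applying the function to a single column works
ddf = ddf.withColumn('fruits', quinn.single_space('fruits'))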
for element in mylist_column:
    ddf = ddf.withColumn(element, quinn.single_space(element))