The column names of my Spark DataFrame df are: A_x1, A_x2, B_x1, B_x2, C_x1, C_x2.
How do I create 3 new Spark DataFrames from df, one per column prefix (A_, B_, C_)? Each new DataFrame should contain only the columns that share a prefix.
Thank you!
You can use colRegex to filter the columns:
A_ = df.select(df.colRegex('`A_.*`'))
B_ = df.select(df.colRegex('`B_.*`'))
C_ = df.select(df.colRegex('`C_.*`'))
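
If you would rather not hard-code each prefix, one possible approach is to derive the prefixes from df.columns and build the selections in a loop. The sketch below is only an illustration: split_by_prefix and its sep parameter are hypothetical names, not part of PySpark, and it assumes the prefix is everything before the first underscore. It relies only on df.columns and df.select.

```python
from collections import defaultdict


def split_by_prefix(df, sep="_"):
    """Return a dict mapping each column prefix to a DataFrame of its columns.

    Hypothetical helper: assumes the prefix is the part of the column
    name before the first occurrence of `sep`.
    """
    groups = defaultdict(list)
    for name in df.columns:
        # "A_x1" -> "A"; a name without `sep` is used as its own prefix
        prefix = name.split(sep, 1)[0]
        groups[prefix].append(name)
    # One select per prefix, keeping the original column order
    return {prefix: df.select(*cols) for prefix, cols in groups.items()}
```

With the df from the question, frames = split_by_prefix(df) would give a dict with keys A, B and C, and frames["A"] would hold the columns A_x1 and A_x2. A dict avoids creating one variable per prefix, which helps when the number of prefixes is not known in advance.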