I have a column called "trigger_piece", it is organized in below way: ["arc", "erf", "erear"]
I'm looking to create new columns (in this case 3 new columns eg: trigger_1, trigger_2, trigger_3) that would extract "arc", "era", "erear".
I have tried using str.split, however it seems like that could be a pretty manual process since we don't know how many trigger extractions there can be. In above example we have 3 but it's also possible to have 4 or 5.
you can use below code.
Below are the data having 3,4 and 5 variable lengths of array.
Code:
max_triggers = df.selectExpr("size(trigger_piece)").agg({"size(trigger_piece)": "max"}).collect()[0]["max(size(trigger_piece))"]
print(max_triggers)
exprs = [col("trigger_piece")[i].alias(f"trigger_{i+1}") for i in range(max_triggers)]
result_df = df.select("*", *exprs)
display(result_df)
Output:
Whatever the size it creates new column with it, and for lesser triggers it assigns nulls.