python pandas explode

Exploding pandas dataframe on multiple columns


I am trying to explode a dataframe on multiple columns (in fact, on any column whose values contain a comma), but I keep getting errors when I try it on more than one column. I have tried split(), explode(), etc.

Input: [image]

Expected output: [image]


Solution

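    def pad_list(lst, max_len):
        # Repeat the last element until the list has max_len items
        return lst + [lst[-1]] * (max_len - len(lst))

    # Split each cell on commas, pad to the row's longest list, then explode all columns
    new_df = df.apply(lambda row: row.apply(lambda item: pad_list(item.split(','), row.str.count(',').max() + 1)), axis=1).explode(df.columns.tolist())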
    

    Output:

    >>> new_df
        Cat1   Cat2    Cat3    Cat4    Cat5
    0    Cat  table     Pen  Toyota  Sydney
    0    Cat  chair     Pen   Honda   Perth
    1  Horse  Stand  Pencil  Holden    Bali
    1  Horse  Stand  eraser  Holden    Bali
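
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The input frame is an assumption: the original input was posted as an image, so `df` below is reconstructed from the output above. The helper name `explode_row_wise` is hypothetical and just wraps the same split, pad, and explode steps. The row width is taken from the largest comma count in the row, so cells with more than two values should also be padded correctly.

```python
import pandas as pd

# Assumed input, reconstructed from the output shown above
# (the original input was only available as an image).
df = pd.DataFrame({
    "Cat1": ["Cat", "Horse"],
    "Cat2": ["table,chair", "Stand"],
    "Cat3": ["Pen", "Pencil,eraser"],
    "Cat4": ["Toyota,Honda", "Holden"],
    "Cat5": ["Sydney,Perth", "Bali"],
})


def pad_list(lst, max_len):
    """Repeat the last element until the list has max_len items."""
    return lst + [lst[-1]] * (max_len - len(lst))


def explode_row_wise(frame):
    """Hypothetical helper: split cells on commas, pad per row, explode all columns."""
    def split_row(row):
        # Longest list in this row = largest comma count + 1
        width = int(row.str.count(",").max()) + 1
        return row.apply(lambda item: pad_list(item.split(","), width))

    # Every list in a row now has the same length, so a multi-column
    # explode will not fail on mismatched element counts.
    return frame.apply(split_row, axis=1).explode(frame.columns.tolist())


new_df = explode_row_wise(df)
print(new_df)
```

Passing a list of columns to `explode` requires pandas 1.3 or later. The exploded rows keep the original index (0, 0, 1, 1); chain `.reset_index(drop=True)` if a fresh index is needed.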