Search code examples
pandasdataframepivot-tablepandas-melt

Pandas pivot columns based on column name prefix


I have a dataframe:

df = AG_Speed AG_wolt AB_Speed AB_wolt C1 C2 C3
       1         2      3         4     6  7  8
       1         9      2         6     4  1  8

And I want to pivot it based on prefix to get:

df = Speed Wolt C1 C2 C3 Category
      1      2   6 7  8    AG
      3      4   6 7  8    AB
      1      9   4 1  8    AG
      2      6   4 1  8    AG 

What is the best way to do it?


Solution

  • We can use pd.wide_to_long for this. But since it expects the column names to start with the stubnames, we have to reverse the column format:

    df.columns = ["_".join(col.split("_")[::-1]) for col in df.columns]
    res = pd.wide_to_long(
        df, 
        stubnames=["Speed", "wolt"], 
        i=["C1", "C2", "C3"], 
        j="Category", 
        sep="_", 
        suffix="[A-Za-z]+"
    ).reset_index()
    
       C1  C2  C3 Category  Speed  wolt
    0   6   7   8       AG      1     2
    1   6   7   8       AB      3     4
    2   4   1   8       AG      1     9
    3   4   1   8       AB      2     6
    

    If you want the columns in a specific order, use DataFrame.reindex:

    res.reindex(columns=["Speed", "wolt", "C1", "C2", "C3", "Category"])
    
       Speed  wolt  C1  C2  C3 Category
    0      1     2   6   7   8       AG
    1      3     4   6   7   8       AB
    2      1     9   4   1   8       AG
    3      2     6   4   1   8       AB