Search code examples
pythonpandasdataframeunpivotsuffix

Remove suffix of the column names and unpivot


I'd like to unpivot the following table with column names "Year", "Item", and "$". My workaround is to separate the table into two dataframes and remove the suffixes, then concatenate the two columns vertically. Are there any other easier ways to approach this?

Example Dataframe:

data = {'Year_x': [1993, 1994, 1995, 1996], 
       'Year_y': [2000, 2001, 2002, 2003],
       'Item_x':['A','B','C','D'],
       'Item_y':['E','F','G','H'],
       '$':[3,4,5,6]}

pd.DataFrame.from_dict(data)
Year_x Year_y Item_x Item_y $
1993 2000 A E 3
1994 2001 B F 4
1995 2002 C G 5
1996 2003 D H 6

What I want to achieve:

Year Item $
1993 A 3
1994 B 4
1995 C 5
1995 D 6
2000 E 3
2001 F 4
2002 G 5
2003 H 6

Solution

  • Concatenation with list comprehension

    >>> pd.concat([df[["Year_" + c, "Item_" + c, "$"]].rename({"Year_" + c : "Year", "Item_" + c : "Item"}, axis=1) for c in ("x", "y")]).reset_index(drop=True)
       Year Item  $
    0  1993    A  3
    1  1994    B  4
    2  1995    C  5
    3  1996    D  6
    4  2000    E  3
    5  2001    F  4
    6  2002    G  5
    7  2003    H  6
    

    Or, via pd.wide_to_long(...)

    >>> pd.wide_to_long(df, ["Year", "Item"], i=["$"], j="Var", sep="_", suffix="\w+").reset_index()
       $ Var  Year Item
    0  3   x  1993    A
    1  4   x  1994    B
    2  5   x  1995    C
    3  6   x  1996    D
    4  3   y  2000    E
    5  4   y  2001    F
    6  5   y  2002    G
    7  6   y  2003    H