Search code examples
pythonpandasdataframestack

How to stack several column into one column but keep a column as it is in python


Let say I have this dataframe

A B C
1 p q
1 q
2 p
2 p q

and I want to stack column B and C into one column (let say column D), But keep column A as it is. How to get this?

Here the expected output I want

A D
1 p
1 q
1 q
2 p
2 p
2 q

Note: As far as I know, to stack column B and C into column D I can use this code:

df_temp = df[['B', 'C']]
df_temp = df_temp.stack().reset_index(name='D')

Solution

  • Use DataFrame.set_index with column A:

    df_temp = (df.set_index('A')[['B','C']]
                  .stack()
                  .reset_index(level=1, drop=True)
                  .reset_index(name='D'))
    print (df_temp)
       A  D
    0  1  p
    1  1  q
    2  1  q
    3  2  p
    4  2  p
    5  2  q
    

    Or:

    df_temp = (df.melt(id_vars='A',value_vars=['B','C'], value_name='D')
                 .drop('variable', axis=1)
                 .dropna(subset=['D'])
                 .sort_values('A', ignore_index=False))
    print (df_temp)
       A  D
    0  1  p
    4  1  q
    5  1  q
    2  2  p
    3  2  p
    7  2  q