Search code examples
pandasindexingpivot-tablepandas-melt

How can I combine 2 levels index into one in pandas?


I am trying to extract only the relevant information from a dataframe. My data looks like

import pandas as pd
import numpy as np

df = pd.DataFrame({'ID': {0: 'id1', 1: 'id1', 2: 'id1'},
                   'EM': {0: 'met1', 1: 'met2', 2: 'met3'},
                   'met1_AVG': {0: 0.38, 1: np.nan, 2: np.nan},
                   'met2_AVG': {0: np.nan, 1: 0.2, 2: np.nan},
                   'met3_AVG': {0: np.nan, 1: np.nan, 2: 0.58},
                   'score': {0: 89, 1: 89, 2: 89}})

My desired output is this is my desired output

Please, find my code below. I really would appreciate if someone could help me out. Thank you in advance for your time and helpful assistance

df_melted = df.melt(id_vars=['ID','EM','score']).dropna(subset=['value'])
df_pivoted = pd.pivot_table(data=df_melted,index=['ID','score'],columns=['variable'])
df_ready = df_pivoted.reset_index()
df_ready

Solution

  • Assuming the score is always same, you can use pandas.DataFrame.groupby.first:

    df.drop("EM",axis=1).groupby("ID", as_index=False).first()
    

    Output:

        ID  met1_AVG  met2_AVG  met3_AVG  score
    0  id1      0.38       0.2      0.58     89