Search code examples
pythonpandasdataframemulti-index

Flatten multiindex dataframe levels and remove string from end of column names if contains


I have a dataframe like this

df = pd.DataFrame(
np.arange(2, 11).reshape(-1, 3),
index=list('ABC'),
columns=pd.MultiIndex.from_arrays([
    ['data1', 'data2','data3'],
    ['F', 'K',''],
    ['', '','']
], names=['meter', 'Sleeper',''])
).rename_axis('Index')

df

meter   data1 data2 data3
Sleeper     F     K                             
Index                    
A           2     3     4
B           5     6     7
C           8     9    10

So I want to join level names and flatted the data following this solution Pandas dataframe with multiindex column - merge levels

  df.columns = df.columns.map('_'.join).str.strip('|')
  df.reset_index(inplace=True)

Getting this

 Index  data1_F_    data2_K_    data3__
0   A     2            3         4
1   B     5            6         7
2   C     8            9        10

but I dont want those _ end of the column names so I added

df.columns = df.columns.apply(lambda x: x[:-1] if x.endswith('_') else x)
df

But got

AttributeError: 'Index' object has no attribute 'apply'

How can I combine map and apply (flatten the column names and remove _ at the end of the column names in one run ?

expected output

  Index data1_F     data2_K   data3
0   A     2            3         4
1   B     5            6         7
2   C     8            9        10

Thanks


Solution

  • You can try this:

    df.columns = df.columns.map('_'.join).str.strip('_')
    df
    Out[132]: 
           data1_F  data2_K  data3
    Index                         
    A            2        3      4
    B            5        6      7
    C            8        9     10