Convert pandas dataframe to multi-index columns


I've got a dataframe like this:

     a  b  c
foo  1  6  9
bar  2  4  8
fud  3  5  7

And I want to convert it to this:

     a        b        c    
  name num name num name num
0  foo   1  bar   4  fud   7
1  bar   2  fud   5  bar   8
2  fud   3  foo   6  foo   9

That is, each original column becomes a (name, num) pair: the numbers are sorted in ascending order, and next to each number sits the index label it had in the original dataframe.

I can do it with a loop, but I keep thinking there must be a more 'pandasy' way to do it. This is the code I used for the above:

import pandas as pd

my_index=['foo','bar','fud']
orig = pd.DataFrame({'a': [1,2,3], 'b':[6,4,5], 'c':[9,8,7]}, index=my_index)
multi = pd.MultiIndex.from_product([['a','b','c'],['name','num']])
x = pd.DataFrame(index=range(3), columns=multi)

for h in orig.columns:
    s = orig[h].sort_values().reset_index()
    x[h,'name'] = s['index']
    x[h,'num'] = s[h]

I'm sure there's a better way to do this, though, so if a pandas expert can help me out, it would be much appreciated.

Thanks!


Solution

  • pandas

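    # `df` is the question's DataFrame (named `orig` there).
    def proc(s):
        # Sort one column, then move the index labels into a 'name' column
        # and call the values column 'num'.
        return s.sort_values().rename_axis('name').reset_index(name='num')
    
    # Build one (name, num) block per column and join them side by side.
    pd.concat({j: proc(c) for j, c in df.items()}, axis=1)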
    
         a        b        c    
      name num name num name num
    0  foo   1  bar   4  fud   7
    1  bar   2  fud   5  bar   8
    2  fud   3  foo   6  foo   9
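
To see what `proc` does, it may help to trace it on a single column. The sketch below rebuilds the question's frame (as `df`) and splits the chain into intermediate steps; the names `step1`, `step2` and `step3` are only illustrative and do not appear in the answer.

```python
import pandas as pd

# Frame from the question (called `orig` there).
df = pd.DataFrame(
    {'a': [1, 2, 3], 'b': [6, 4, 5], 'c': [9, 8, 7]},
    index=['foo', 'bar', 'fud'],
)

s = df['b']                             # Series with index foo, bar, fud
step1 = s.sort_values()                 # values ascending; index labels move with them
step2 = step1.rename_axis('name')       # name the index so it becomes a 'name' column
step3 = step2.reset_index(name='num')   # index -> 'name' column, values -> 'num' column

print(step3)
```

Each column yields a small two-column frame with a fresh 0..n-1 index, which is why `pd.concat(..., axis=1)` can line the pieces up row by row.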
    

  • dash of numpy

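    import numpy as np
    
    # `df` is the question's DataFrame (named `orig` there).
    v = df.values
    a = v.argsort(0)                    # row positions that sort each column
    r = np.arange(v.shape[1])[None, :]  # column positions, broadcast against `a`
    
    nums = pd.DataFrame(v[a, r], columns=df.columns)
    names = pd.DataFrame(df.index.values[a], columns=df.columns)
    
    # The keys become the inner column level after swaplevel.
    pd.concat(
        [names, nums],
        axis=1,
        keys=['name', 'num']
    ).swaplevel(0, 1, axis=1).sort_index(axis=1)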
    
         a        b        c    
      name num name num name num
    0  foo   1  bar   4  fud   7
    1  bar   2  fud   5  bar   8
    2  fud   3  foo   6  foo   9
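
The indexing step in the numpy version is the least obvious part, so here is a rough sketch of what `a` and `v[a, r]` contain for the question's data. The variable names `sorted_vals` and `sorted_names` are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Frame from the question (called `orig` there).
df = pd.DataFrame(
    {'a': [1, 2, 3], 'b': [6, 4, 5], 'c': [9, 8, 7]},
    index=['foo', 'bar', 'fud'],
)

v = df.to_numpy()
a = v.argsort(axis=0)                # a[i, j] = row position of the i-th smallest value in column j
r = np.arange(v.shape[1])[None, :]   # shape (1, 3): column positions, broadcast against a

sorted_vals = v[a, r]                    # element (i, j) is v[a[i, j], j]
sorted_names = df.index.to_numpy()[a]    # same row positions applied to the index labels

print(a)
print(sorted_vals)
print(sorted_names)
```

Because one `argsort` result drives both lookups, names and numbers stay paired. `np.take_along_axis(v, a, axis=0)` should give the same array as `v[a, r]` if you prefer to avoid building `r` by hand.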