
How to make a hierarchical index from data in the columns?


I have tried and searched for two days on this (I'm still quite the noob). Can anyone help me with this probably simple task?

I am practicing with data in this format:

from pandas import DataFrame

df = DataFrame({'year':[2010]*5 + [2011]*5, 'names':['a','b','c','d','e']*2, 'births':[1,2,3,4,5,6,7,8,9,10]})


    births names  year
0       1     a  2010
1       2     b  2010
2       3     c  2010
3       4     d  2010
4       5     e  2010
5       6     a  2011
6       7     b  2011
7       8     c  2011
8       9     d  2011
9      10     e  2011

And I want to get it in this format:

Year Name   births
2010   a    1
       b    2
       c    3
       d    4
       e    5
2011   a    6
       b    7
       c    8
       d    9
       e    10

I want it like this so I can easily access a row with a combined primary key, something like df.ix('2010','a'). I don't know whether this is even possible or, if it is, how to reference it.

Can anyone explain how I do this? Thank you!


Solution

  • df = df.set_index(['year', 'names']) will give you what you want. You can then select rows with xs:

    In[781]: df.set_index(['year', 'names']).xs(2010)
    Out[777]: 
           births
    names        
    a           1
    b           2
    c           3
    d           4
    e           5
    In[782]: df.set_index(['year', 'names']).xs((2010, 'a'))
    Out[778]: 
    births    1
    Name: (2010, a), dtype: int64
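
    The question asks about lookups in the style of df.ix('2010','a'). As a minimal sketch, assuming a recent pandas release (where .ix has been removed), the same combined-key lookup can be written with .loc and a tuple. The variable names indexed, row, value and year_2010 are only illustrative. Note that the year column holds integers, so the key is 2010 rather than the string '2010'.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'year': [2010] * 5 + [2011] * 5,
    'names': ['a', 'b', 'c', 'd', 'e'] * 2,
    'births': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
})

# Move the two columns into a hierarchical (MultiIndex) row index.
indexed = df.set_index(['year', 'names'])

# Full key as a tuple: selects the single row for (2010, 'a').
row = indexed.loc[(2010, 'a'), :]

# Full key plus a column label: selects one scalar value.
value = indexed.loc[(2010, 'a'), 'births']

# Partial key (first level only): all rows for 2010, indexed by names.
year_2010 = indexed.loc[2010]
```

    Passing only the first level returns a smaller frame keyed by names, which is the same result that xs(2010) gives above.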