pandas Multiindex - set_index with list of tuples


I ran into the following issue. I have an existing MultiIndex and want to replace the values of a single level with a list of tuples, but I get a strange error.

Code to reproduce:

import pandas as pd

idx = pd.MultiIndex.from_tuples([(1, u'one'), (1, u'two'),
                                  (2, u'one'), (2, u'two')],
                                  names=['foo', 'bar'])

idx.set_levels([3, 5], level=0)  # works fine
idx.set_levels([(1,2),(3,4)], level=0)  # TypeError: Levels must be list-like

Can anyone comment on: 1) What is the issue? 2) What is the best way to replace the index values (int values -> tuple values)? Thanks!


Solution

  • What works for me is building a new MultiIndex with a constructor:

    idx = pd.MultiIndex.from_product([[(1,2),(3,4)], idx.levels[1]], names=idx.names)
    print (idx)
    MultiIndex(levels=[[(1, 2), (3, 4)], ['one', 'two']],
               labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
               names=['foo', 'bar'])
    

    EDIT1:

    df = pd.DataFrame({'A':list('abcdef'),
                       'B':[1,2,1,2,2,1],
                       'C':[7,8,9,4,2,3],
                       'D':[1,3,5,7,1,0],
                       'E':[5,3,6,9,2,4],
                       'F':list('aaabbb')}).set_index(['B','C'])
    
    
    # dynamically generate a dictionary from a list of tuples
    new = [(1, 2), (3, 4)]
    d = dict(zip(df.index.levels[0], new))
    print (d)
    {1: (1, 2), 2: (3, 4)}
    
    # or define the dictionary explicitly
    d = {1:(1,2), 2:(3,4)}
    
    # rename the first level of the MultiIndex
    df = df.rename(index=d, level=0)
    print (df)
              A  D  E  F
    B      C            
    (1, 2) 7  a  1  5  a
    (3, 4) 8  b  3  3  a
    (1, 2) 9  c  5  6  a
    (3, 4) 4  d  7  9  b
           2  e  1  2  b
    (1, 2) 3  f  0  4  b
    

    EDIT:

    import numpy as np

    # repeat the new tuples so they line up with each row of the first level
    new = [(1, 2), (3, 4)]
    lvl0 = list(map(tuple, np.array(new)[pd.factorize(idx.get_level_values(0))[0]].tolist()))
    print (lvl0)
    [(1, 2), (1, 2), (3, 4), (3, 4)]
    
    idx = pd.MultiIndex.from_arrays([lvl0, idx.get_level_values(1)], names=idx.names)
    print (idx)
    MultiIndex(levels=[[(1, 2), (3, 4)], ['one', 'two']],
               labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
               names=['foo', 'bar'])
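
The mapping idea above can also be wrapped in a small reusable function. This is only a minimal sketch: `replace_level` is a hypothetical helper name (not part of pandas), it assumes the level is given by integer position, and it assumes any label missing from the mapping should be kept unchanged. It rebuilds the index entry by entry with `pd.MultiIndex.from_tuples`, so it may be slow for very large indexes.

```python
import pandas as pd


def replace_level(idx, pos, mapping):
    """Return a new MultiIndex with the labels at position `pos` replaced.

    Hypothetical helper for illustration; labels missing from `mapping`
    are left unchanged (an assumption, not pandas behaviour).
    """
    # Rebuild every index entry, swapping only the label at position `pos`.
    entries = [
        tuple(mapping.get(label, label) if i == pos else label
              for i, label in enumerate(entry))
        for entry in idx
    ]
    return pd.MultiIndex.from_tuples(entries, names=idx.names)


idx = pd.MultiIndex.from_tuples(
    [(1, "one"), (1, "two"), (2, "one"), (2, "two")],
    names=["foo", "bar"],
)

# Replace the int labels of the first level with tuples.
new_idx = replace_level(idx, 0, {1: (1, 2), 2: (3, 4)})
print(list(new_idx.get_level_values(0)))
```

Unlike the `from_product` constructor, this keeps the original row order and combinations, so it should also fit an index that is not a full Cartesian product.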