python, pandas, dataframe, series, multi-index

How to fill in gaps of duplicate indices in dataframe?


I have a dataframe like the one shown below:

import numpy as np
import pandas as pd

# Five rows of random sample data
tdf = pd.DataFrame({'grade': np.random.choice(list('AAAD'), size=5),
                   'dash': np.random.choice(list('PPPS'), size=5),
                   'dumeel': np.random.choice(list('QWRR'), size=5),
                   'dumma': np.random.choice(1234, size=5),  # integers in [0, 1234)
                   'target': np.random.choice([0, 1], size=5)
})

I am trying to create a multi-index dataframe from some of the input columns, so I tried the following:

tdf.set_index(['grade','dumeel'],inplace=True)

However, the result is displayed with gaps where index entries are duplicated.

How can I avoid that and show my dataframe with all index values, whether they are duplicates or not?

I would like my output to show every row with its corresponding index values from the original dataframe.


Solution

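  • This is only a display issue. The index values are all still there; by default pandas hides repeated outer-level labels when printing, and the display.multi_sparse option turns that off: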

    tdf.set_index(['grade','dumeel'],inplace=True)
    
    print(tdf)
                 dash  dumma  target
    grade dumeel                    
    A     W         S    855       1
          R         P    498       1
          R         P    378       0
          W         P    211       0
          W         P     12       0
          
    with pd.option_context("display.multi_sparse", False):
        print(tdf)
                 dash  dumma  target
    grade dumeel                    
    A     W         S    855       1
    A     R         P    498       1
    A     R         P    378       0
    A     W         P    211       0
    A     W         P     12       0
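
To check that nothing is actually missing from the index, a minimal sketch along these lines may help. It uses a small fixed dataframe (hypothetical values, not the random data from the question) so the result is reproducible, inspects the index directly, and renders the frame with `DataFrame.to_string(sparsify=False)` as a per-call alternative to the option context:

```python
import pandas as pd

# Fixed sample data (hypothetical values, chosen only to be reproducible)
df = pd.DataFrame({
    "grade": ["A", "A", "A", "D"],
    "dumeel": ["W", "R", "R", "Q"],
    "target": [1, 1, 0, 0],
}).set_index(["grade", "dumeel"])

# The index stores one complete tuple per row, duplicates included.
index_tuples = df.index.tolist()
print(index_tuples)

# Default rendering: repeated outer-level labels are left blank.
sparse_text = df.to_string()

# sparsify=False writes every label on every row, for this call only.
dense_text = df.to_string(sparsify=False)
print(dense_text)
```

If the dense layout is wanted everywhere rather than for a single print, `pd.set_option("display.multi_sparse", False)` should change the default for the whole session.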