python, pandas, dataframe, multi-index, hierarchical-data

Data structure to provide to DataFrame with MultiIndex


I added a second level of column headers with a MultiIndex, and the data structure I pass to the DataFrame constructor no longer works. I used to pass a dictionary of lists, but that does not fill the columns anymore.

This is my code:

import pandas as pd

data_1 = {'a': ['d', 'd', 'd'],
          'b': ['4', 'f', '4'],
          'c': ['f', 't', 't'],
          'd': ['38', 'B24t', 'dod']
}
col = pd.MultiIndex.from_arrays([['one', 'one', 'one', 'two'],
                                 ['a', 'b', 'c', 'd']]
)
data = pd.DataFrame(data_1, columns=col)

But the resulting columns are empty.

This is what I am trying to get: the same values from data_1, with 'one' as the top-level header over columns a, b and c, and 'two' over column d.


Solution

  • set_axis

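    When you pass col to the DataFrame constructor, pandas tries to line up the keys of data_1 ('a', 'b', ...) with the labels in col (('one', 'a'), ('one', 'b'), ...). None of them match, so no data ends up in the columns. Instead, build the DataFrame from the dictionary first and replace the column labels afterwards with set_axis.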

    data = pd.DataFrame(data_1).set_axis(col, axis=1)
    
    data
    
      one         two
        a  b  c     d
    0   d  4  f    38
    1   d  f  t  B24t
    2   d  4  t   dod
    

    Alternatively, you could put the MultiIndex information in the dictionary itself by using tuples as keys. The constructor builds the column MultiIndex from those tuples.

    data_1 = {
        ('one', 'a'): ['d', 'd', 'd'],
        ('one', 'b'): ['4', 'f', '4'],
        ('one', 'c'): ['f', 't', 't'],
        ('two', 'd'): ['38', 'B24t', 'dod']
    }
    data = pd.DataFrame(data_1)
    
    data
    
      one         two
        a  b  c     d
    0   d  4  f    38
    1   d  f  t  B24t
    2   d  4  t   dod
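
To tie the two approaches together, here is a minimal sketch that checks why the original call did not line up and builds the tuple-keyed dictionary from the existing col object instead of typing it out by hand. The variable names no_overlap, tuple_keyed, via_set_axis and via_tuple_keys are illustrative only, and the zip step assumes the keys of data_1 are in the same order as the labels in col.

```python
import pandas as pd

data_1 = {
    'a': ['d', 'd', 'd'],
    'b': ['4', 'f', '4'],
    'c': ['f', 't', 't'],
    'd': ['38', 'B24t', 'dod'],
}
col = pd.MultiIndex.from_arrays([['one', 'one', 'one', 'two'],
                                 ['a', 'b', 'c', 'd']])

# The dictionary keys are plain strings, while iterating a MultiIndex
# yields tuples, so no key matches a requested column label.
no_overlap = set(data_1).isdisjoint(set(col))

# Approach 1: build with the flat keys, then swap in the MultiIndex.
via_set_axis = pd.DataFrame(data_1).set_axis(col, axis=1)

# Approach 2: pair each tuple label with the list at the same position.
# Assumption: data_1 keeps its insertion order and that order matches col.
tuple_keyed = dict(zip(col, data_1.values()))
via_tuple_keys = pd.DataFrame(tuple_keyed)

# Both routes should give the same frame.
same_result = via_set_axis.equals(via_tuple_keys)
print(no_overlap, same_result)
```

If the key order of the dictionary is not guaranteed to match col, the explicit tuple keys shown above are the safer choice, since each list is then tied to its label by name rather than by position.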