
ValueError: Length mismatch: Expected axis has 0 elements while creating hierarchical columns in pandas dataframe


I was going through the pandas documentation on hierarchical indexing and tried adapting one of its examples to create an empty dataframe with hierarchical columns:

In [5]: df = pd.DataFrame()

In [6]: df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])

However, it throws an error:

ValueError                                Traceback (most recent call last)
<ipython-input-6-dd823f9b8d22> in <module>()
----> 1 df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])

/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in __setattr__(self, name, value)
   2755         try:
   2756             object.__getattribute__(self, name)
-> 2757             return object.__setattr__(self, name, value)
   2758         except AttributeError:
   2759             pass

pandas/src/properties.pyx in pandas.lib.AxisProperty.__set__ (pandas/lib.c:44873)()

/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
    446 
    447     def _set_axis(self, axis, labels):
--> 448         self._data.set_axis(axis, labels)
    449         self._clear_item_cache()
    450 

/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
   2800             raise ValueError('Length mismatch: Expected axis has %d elements, '
   2801                              'new values have %d elements' %
-> 2802                              (old_len, new_len))
   2803 
   2804         self.axes[axis] = new_labels

ValueError: Length mismatch: Expected axis has 0 elements, new values have 4 elements

I don't see any problem with my code. Any ideas what is happening?


Solution

  • The problem is that your empty data frame has zero columns, and you are trying to assign a four-column multi-index to it. Assigning to df.columns only relabels columns that already exist; it cannot add new ones, so the lengths must match. If you create an empty data frame with four columns first, the error goes away:

    import numpy as np  # pd.np was removed in pandas 2.0
    df = pd.DataFrame(np.empty((0, 4)))
    # pandas 0.24+ spells the `labels` keyword `codes`
    df.columns = pd.MultiIndex(levels=[['first', 'second'], ['a', 'b']], codes=[[0, 0, 1, 1], [0, 1, 0, 1]])
    

    Alternatively, you can pass the multi-index to the constructor when creating the empty data frame:

    # On pandas 0.24+ the keyword is `codes`; older releases call it `labels`
    multi_index = pd.MultiIndex(levels=[['first', 'second'], ['a', 'b']], codes=[[0, 0, 1, 1], [0, 1, 0, 1]])
    df = pd.DataFrame(columns=multi_index)
    
    df
    #   first    second
    #  a    b   a     b
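
To see the length check and both fixes side by side, here is a minimal self-contained sketch. It assumes a reasonably recent pandas and builds the columns with MultiIndex.from_product rather than levels/codes; that is my own shortcut for brevity, not something the answer above requires. The variable names (columns, df_a, df_b, error_message) are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same 2x2 column hierarchy as in the question.
# from_product is an assumption made for brevity; levels/codes works too.
columns = pd.MultiIndex.from_product([["first", "second"], ["a", "b"]])

# Reproduce the failure: a frame with 0 columns cannot take 4 labels.
empty = pd.DataFrame()
try:
    empty.columns = columns
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# Fix 1: start from a frame that already has four (empty) columns.
df_a = pd.DataFrame(np.empty((0, 4)))
df_a.columns = columns

# Fix 2: hand the MultiIndex to the constructor directly.
df_b = pd.DataFrame(columns=columns)

print(df_a.shape, df_b.shape)  # (0, 4) (0, 4)
print(df_b.columns.tolist())
```

Both frames end up with zero rows and four hierarchical columns. The second form is usually the simpler one, since it never creates a frame whose column count has to be matched afterwards.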