Tags: python, pandas, multi-index

Creating an empty MultiIndex


I would like to create an empty DataFrame with a MultiIndex before assigning rows to it. I have already found that empty DataFrames don't like being assigned a MultiIndex on the fly, so I'm setting the MultiIndex names during creation. However, I don't want to assign levels yet, as this will be done later. This is the best code I have come up with so far:

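from pandas import DataFrame, MultiIndex

def empty_multiindex(names):
    """
    Creates empty MultiIndex from a list of level names.
    """
    # One all-None tuple is used as a placeholder so the number of levels is known.
    return MultiIndex.from_tuples(tuples=[(None,) * len(names)], names=names)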

This gives me:

In [2]: empty_multiindex(['one', 'two', 'three'])

Out[2]:
MultiIndex(levels=[[], [], []],
           labels=[[-1, -1, -1], [-1, -1, -1], [-1, -1, -1]],
           names=[u'one', u'two', u'three'])

and

In [3]: DataFrame(index=empty_multiindex(['one', 'two', 'three']))

Out[3]:
one two three
NaN NaN NaN

Well, I have no use for these NaNs. I can easily drop them later, but this is obviously a hackish solution. Does anyone have a better one?


Solution

  • The solution is to leave the labels (called codes in current pandas) empty. This works fine for me:

    >>> import pandas as pd
    >>> my_index = pd.MultiIndex(levels=[[],[],[]],
    ...                          codes=[[],[],[]],
    ...                          names=[u'one', u'two', u'three'])
    >>> my_index
    MultiIndex([], names=['one', 'two', 'three'])
    >>> my_columns = [u'alpha', u'beta']
    >>> df = pd.DataFrame(index=my_index, columns=my_columns)
    >>> df
    Empty DataFrame
    Columns: [alpha, beta]
    Index: []
    >>> df.loc[('apple','banana','cherry'),:] = [0.1, 0.2]
    >>> df
                        alpha beta
    one   two    three
    apple banana cherry   0.1  0.2
    

    For pandas versions < 0.25.1, the keyword labels can be used in place of codes (labels was renamed to codes in pandas 0.24).
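
The same idea can be folded back into the helper from the question. The following is only a minimal sketch, not part of the original answer: it assumes that `MultiIndex.from_arrays` accepts one empty list per level, which avoids spelling out `levels` and `codes` (or `labels`) by hand. The column names `alpha` and `beta` are taken from the example above.

```python
import pandas as pd


def empty_multiindex(names):
    """Return a zero-length MultiIndex with one level per entry in names."""
    # Assumption: passing one empty list per level yields empty levels and
    # empty codes, so no placeholder row of NaNs is created.
    return pd.MultiIndex.from_arrays([[] for _ in names], names=names)


index = empty_multiindex(["one", "two", "three"])
df = pd.DataFrame(index=index, columns=["alpha", "beta"])

print(len(index), index.nlevels)  # number of rows, number of levels
print(df.empty)                   # whether the DataFrame has no rows
```

Rows can then be assigned with `df.loc[...]` as in the session above, and there is nothing to drop afterwards.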