Set MultiIndex of an existing DataFrame in pandas


I have a DataFrame that looks like

  Emp1    Empl2           date       Company
0    0        0     2012-05-01         apple
1    0        1     2012-05-29         apple
2    0        1     2013-05-02         apple
3    0        1     2013-11-22         apple
18   1        0     2011-09-09        google
19   1        0     2012-02-02        google
20   1        0     2012-11-26        google
21   1        0     2013-05-11        google

I want to pass the company and date for setting a MultiIndex for this DataFrame. Currently it has a default index. I am using

df.set_index(['Company', 'date'], inplace=True)

But when I print the result, it prints None. Is this not the correct way of doing it? I also want to control the order of the index levels, so that Company becomes the first level and date becomes the second in the hierarchy. Any ideas on this?


Solution

  • When you pass inplace=True, set_index makes the changes on the original DataFrame and returns None; it does not return the modified DataFrame.

    is_none = df.set_index(['Company', 'date'], inplace=True)
    df  # the dataframe you want
    is_none # has the value None
    

    so when you have a line like:

    df = df.set_index(['Company', 'date'], inplace=True)
    

    it first modifies df... but then it sets df to None!

    That is, you should just use the line:

    df.set_index(['Company', 'date'], inplace=True)
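
As a minimal sketch of the behaviour described above, the snippet below builds a small frame shaped like the one in the question (the rows are illustrative sample data, not the asker's full dataset) and compares the two calling styles. It also touches on the second part of the question: the index levels follow the order of the list passed to set_index, and swaplevel is one way to reverse them afterwards.

```python
import pandas as pd

# Illustrative sample data shaped like the frame in the question.
df = pd.DataFrame({
    "Emp1": [0, 0, 1, 1],
    "Empl2": [0, 1, 0, 0],
    "date": pd.to_datetime(
        ["2012-05-01", "2012-05-29", "2011-09-09", "2012-02-02"]
    ),
    "Company": ["apple", "apple", "google", "google"],
})

# Without inplace, set_index returns a new DataFrame and leaves df untouched.
indexed = df.set_index(["Company", "date"])

# With inplace=True, df itself is modified and the call returns None.
result = df.set_index(["Company", "date"], inplace=True)

print(result)                     # None
print(list(df.index.names))       # ['Company', 'date']
print(list(indexed.index.names))  # ['Company', 'date']

# Level order follows the list order; swaplevel reverses the two levels.
swapped = indexed.swaplevel("Company", "date")
print(list(swapped.index.names))  # ['date', 'Company']
```

Either style gives the same MultiIndex; the assignment form (df = df.set_index([...]) without inplace) avoids the None pitfall entirely.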