Search code examples
pythonpandasseriesmulti-index

Pandas reset index on series to remove multiindex


I have a Series that looks like this:

1999-03-31  SOLD_PRICE     NaN
1999-06-30  SOLD_PRICE     NaN
1999-09-30  SOLD_PRICE     NaN
1999-12-31  SOLD_PRICE    3.00
2000-03-31  SOLD_PRICE    3.00

with an index that looks like:

MultiIndex
[(1999-03-31 00:00:00, u'SOLD_PRICE'), (1999-06-30 00:00:00, u'SOLD_PRICE'), 
 (1999-09-30 00:00:00, u'SOLD_PRICE'), (1999-12-31 00:00:00, u'SOLD_PRICE'),...]

I don't want the second column as an index. Ideally I'd have a DataFrame with column 1 as "Date" and column 2 as "Sales" (dropping the second level of the index). I don't quite see how to reconfigure the index.


Solution

  • Just call reset_index():

    In [130]: s
    Out[130]:
    0           1
    1999-03-31  SOLD_PRICE   NaN
    1999-06-30  SOLD_PRICE   NaN
    1999-09-30  SOLD_PRICE   NaN
    1999-12-31  SOLD_PRICE     3
    2000-03-31  SOLD_PRICE     3
    Name: 2, dtype: float64
    
    In [131]: s.reset_index()
    Out[131]:
                0           1   2
    0  1999-03-31  SOLD_PRICE NaN
    1  1999-06-30  SOLD_PRICE NaN
    2  1999-09-30  SOLD_PRICE NaN
    3  1999-12-31  SOLD_PRICE   3
    4  2000-03-31  SOLD_PRICE   3
    

    There are many ways to drop columns:

    Call reset_index() twice and specify a column:

    In [136]: s.reset_index(0).reset_index(drop=True)
    Out[136]:
                0   2
    0  1999-03-31 NaN
    1  1999-06-30 NaN
    2  1999-09-30 NaN
    3  1999-12-31   3
    4  2000-03-31   3
    

    Delete the column after resetting the index:

    In [137]: df = s.reset_index()
    
    In [138]: df
    Out[138]:
                0           1   2
    0  1999-03-31  SOLD_PRICE NaN
    1  1999-06-30  SOLD_PRICE NaN
    2  1999-09-30  SOLD_PRICE NaN
    3  1999-12-31  SOLD_PRICE   3
    4  2000-03-31  SOLD_PRICE   3
    
    In [139]: del df[1]
    
    In [140]: df
    Out[140]:
                0   2
    0  1999-03-31 NaN
    1  1999-06-30 NaN
    2  1999-09-30 NaN
    3  1999-12-31   3
    4  2000-03-31   3
    

    Call drop() after resetting:

    In [144]: s.reset_index().drop(1, axis=1)
    Out[144]:
                0   2
    0  1999-03-31 NaN
    1  1999-06-30 NaN
    2  1999-09-30 NaN
    3  1999-12-31   3
    4  2000-03-31   3
    

    Then, after you've reset your index, just rename the columns

    In [146]: df.columns = ['Date', 'Sales']
    
    In [147]: df
    Out[147]:
             Date  Sales
    0  1999-03-31    NaN
    1  1999-06-30    NaN
    2  1999-09-30    NaN
    3  1999-12-31      3
    4  2000-03-31      3