
pandas - slice of multiindex not adjusting index values as expected


In Pandas, I am trying to filter out rows with specific dates (set as first level of a multiindex) in a dataframe.

Once filtered, I'd like to check whether the last index value for the first level matches with my latest date. However, I can't get Pandas to return the right value.

An example may be helpful. I first create the original df with multiindex:

import pandas as pd

index = pd.date_range('2016-01-01', freq='B', periods=10), ["AAPL", "GOOG"]
df = pd.DataFrame(index=pd.MultiIndex.from_product(index))
print(df)

Then I pick the start and end dates for the filter:

start, end = df.index.levels[0][1], df.index.levels[0][-4]
print(start, end)

Now, I create my filtered df only including dates from start till end:

df2 = df.loc[start:end]
df2

This looks fine, as expected: "01/11/2016" is my last index date.

Then, when I check the last index value for the first level (0), it returns "01/14/16" instead of my chosen end date ("01/11/2016").

print(df2.index.levels[0][-1])

How can I get the last date from df2? Am I missing something or is this a bug?


Solution

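  • Look at df2.index: it is not what you think. The levels attribute only stores the unique values needed to reconstruct the multi-index, and slicing does not trim it, so df2.index.levels[0] still holds every date from the original df.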

    If you want to access index values, use get_level_values:

    df2.index.get_level_values(0)
    

    Then df2.index.get_level_values(0)[-1] should return what you expected.
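
As a minimal sketch of the difference, the snippet below rebuilds the question's data and compares the two lookups. The variable names (dates, stale_last, actual_last, trimmed_last) are made up for illustration, and the remove_unused_levels() call is an extra option that is not part of the answer above; it assumes pandas 0.20 or later.

```python
import pandas as pd

# Rebuild the frame from the question: 10 business days x 2 tickers.
dates = pd.date_range("2016-01-01", freq="B", periods=10)
df = pd.DataFrame(index=pd.MultiIndex.from_product([dates, ["AAPL", "GOOG"]]))

# Slice on the first level, as in the question.
start, end = df.index.levels[0][1], df.index.levels[0][-4]
df2 = df.loc[start:end]

# levels still holds every date from the original frame.
stale_last = df2.index.levels[0][-1]

# get_level_values returns the labels actually present in df2.
actual_last = df2.index.get_level_values(0)[-1]

# Alternative (assumption: pandas >= 0.20): drop the unused levels first,
# after which levels reflects only the dates left in df2.
trimmed_last = df2.index.remove_unused_levels().levels[0][-1]

print(stale_last, actual_last, trimmed_last)
```

Here actual_last and trimmed_last should both equal end, while stale_last is still the last date of the original range.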