Search code examples
pythonpython-3.xpandasmulti-index

Getting index value in pandas multiindex based on condition


I have created a dataframe with this code: The objective of this is to find the weekly low and to get the dates date at which the weekly low took place. To do this:

import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime

df = web.DataReader('GOOG', 'yahoo', start, end)
df2 = web.DataReader('GOOG', 'yahoo', start, end)
start = datetime.datetime(2021,1,1)
end = datetime.datetime.today()

df['Date1'] = df.index

df['month'] = df.index.month
df['week'] = df.index.week
df['day'] = df.index.day
df.set_index('week',append=True,inplace=True)
df.set_index('day',append=True,inplace=True)

To get the weekly low :

df['Low'].groupby(['week']).min().tail(50)

enter image description here

I trying to find out the date on which weekly low occured: Such as 1735.420044

If i try to do this :

df['Low'].isin([1735.420044])

I get :

Date        week  day
2020-12-31  53    31     False
2021-01-04  1     4      False
2021-01-05  1     5      False
2021-01-06  1     6      False
2021-01-07  1     7      False
                         ...  
2021-08-02  31    2      False
2021-08-03  31    3      False
2021-08-04  31    4      False
2021-08-05  31    5      False
2021-08-06  31    6      False
Name: Low, Length: 151, dtype: bool

How can i get the actual dates for the low?


Solution

  • To get the weekly lows, you could simply access the index.

    res = df['Low'].groupby(['week']).min()
    

    res is the series of lowest prices with the date in the index. You can access the raw numpy array that represents the index with res.index.values. This will include week and day levels as well.

    To get just the dates as a series, this should work:

    dates = res.index.get_level_values("Date").to_series() 
    

    PS: Clarification from the comments

    df['Low'].isin([1735.420044]).any() # returns False
    

    The above doesn't work for you (should return True if there's a match) because when you say .isin([<bunch of floats>]), you are essentially comparing floats for equality. Which doesn't work because floating point comparisons can never be guaranteed to be exact, they always have to be in ranges of tolerance (this is not Python specific, true for all languages). Sometimes it might seem to work in Python, but that is entirely coincidental and is a result of underlying memory optimisations. Have a look at this thread to gain some (Python specific) insight into this.