I'm trying to split a data set into a training part and a testing part. I'm stuck on a structural problem: the hierarchy of the data seems to be wrong for the code below to work.
I tried the following:
import pandas as pd
import pandas_datareader.data as web  # provides web.DataReader
data = pd.DataFrame(web.DataReader('SPY', data_source='morningstar')['Close'])
cutoff = '2015-1-1'
data = data[data.index < cutoff].dropna().copy()
As data.head() will reveal, data is not actually a pd.DataFrame but a pd.Series whose index is a pd.MultiIndex (as the error also suggests, since it hints that each element is a tuple) rather than a pd.DatetimeIndex.
What you could do is simply set
df = data.unstack(0)
With that, df[df.index < cutoff] performs the filtering you are trying to do.
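To try this without a network call, here is a minimal sketch on synthetic data. It assumes the downloaded series is indexed by (Symbol, Date), in that order, which is what unstack(0) above implies. The level names, the prices, and the train/test variable names are made up for illustration, and the cutoff is converted to a pd.Timestamp to keep the comparison explicit.

```python
import pandas as pd

# Synthetic stand-in for the downloaded data: a Series of closing prices
# indexed by a (Symbol, Date) MultiIndex. Names and values are assumptions.
dates = pd.date_range("2014-12-29", periods=6, freq="D")
index = pd.MultiIndex.from_product([["SPY"], dates], names=["Symbol", "Date"])
close = pd.Series([207.0, 206.0, 205.0, 204.0, 203.0, 202.0],
                  index=index, name="Close")

# Move level 0 (the symbol) into the columns; the remaining index is the
# Date level, i.e. a DatetimeIndex that can be compared against a date.
df = close.unstack(0)

cutoff = pd.Timestamp("2015-01-01")

# Everything strictly before the cutoff is training data, the rest is test data.
train = df[df.index < cutoff].dropna()
test = df[df.index >= cutoff].dropna()
```

Here train and test are ordinary data frames with one column per symbol, so the same split carries over unchanged if more symbols are downloaded.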