
Pandas indexing raises an error (result = self._data[key])


I get an error when indexing with Pandas. Why? I even asked Derek Banas and he wasn't sure why it didn't work, so please help. The error is quoted at the bottom.

This is my code:

import numpy as np 

import pandas as pd
from pandas_datareader import data as web
import matplotlib.pyplot as plt 
import matplotlib.dates as mdates
%matplotlib inline

import datetime as dt #For defining dates
import mplfinance as mpf # Matplotlib finance

import time
import yfinance as yf;
# Used to get data from a directory
import os
from os import listdir
from os.path import isfile, join
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 5);
#Statsmodels is a great library we can use to run regressions.
import statsmodels.api as sm
# Seaborn extends the capabilities of Matplotlib
import seaborn as sns
# Used for calculating regressions
from statsmodels.tsa.ar_model import AutoReg, ar_select_order

msft= yf.download(tickers='MSFT', period = '1mo',interval = '5m');
x = msft.index; close = msft.index['Adj Close'], high = msft['High']; low = msft['Low']; openprice=msft['Open'];
#print(x); print(high);

print(msft) 

Error is:

x = msft.index; close = msft.index['Adj Close'];

---- " [my addrs]...\pandas\core\indexes\extension.py, line 238 in getitem

result = self._data[key]

Following a tutorial: https://youtu.be/boouvnzw-G8?t=582

I'm using Eclipse on Windows. Why doesn't the indexing from the video work in my Python code? I'd appreciate any help; I'm a student and cannot figure out why my copy-paste of the GitHub repository code won't work.


Solution

  • The problem is that msft.index is the row index of your DataFrame, not a column, but you are trying to access it as if it held the columns (msft.index[<ColumnName>]).
    It should simply be msft[<ColumnName>]. See below for details:

    Run:

    msft= yf.download(tickers='MSFT', period = '1mo',interval = '5m');
    type(msft.index)
    

    Output:

    pandas.core.indexes.datetimes.DatetimeIndex
    

    Run:

    msft.head()
    

    Output:

    
                                     Open        High         Low       Close   Adj Close   Volume
    Datetime
    2021-11-04 09:30:00-04:00  332.890015  333.239990  329.640015  330.549988  330.549988  1619634
    2021-11-04 09:35:00-04:00  330.000000  330.589996  329.859985  330.390015  330.390015   966856
    2021-11-04 09:40:00-04:00  330.500000  332.299988  330.450104  332.269989  332.269989   760918
    2021-11-04 09:45:00-04:00  332.250000  333.100006  332.170013  332.989990  332.989990   526600
    2021-11-04 09:50:00-04:00  332.750000  333.179993  332.510010  333.174988  333.174988   498996
    

    You need to access it this way (get rid of .index):

    Run:

    msft['Adj Close']
    

    Output:

    Datetime
    2021-11-04 09:30:00-04:00    330.549988
    2021-11-04 09:35:00-04:00    330.390015
    2021-11-04 09:40:00-04:00    332.269989
    2021-11-04 09:45:00-04:00    332.989990
    2021-11-04 09:50:00-04:00    333.174988
                                    ...    
    2021-12-03 15:35:00-05:00    320.594604
    2021-12-03 15:40:00-05:00    320.304993
    2021-12-03 15:45:00-05:00    320.317505
    2021-12-03 15:50:00-05:00    321.679993
    2021-12-03 15:55:00-05:00    323.149994
    Name: Adj Close, Length: 1603, dtype: float64
    

    I'm guessing this error came from a copy/paste typo, because the code in the video does not have the extra .index.

    I would strongly echo the advice of @WesleyJonCheek: "One statement per line."
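
To make the difference concrete without needing a network download, here is a minimal sketch that builds a small stand-in DataFrame by hand, using the first rows of the msft.head() output above. The hand-built frame and the name lookup_on_index_failed are illustrative assumptions, not part of the original code. The real yf.download result also has timezone-aware timestamps, which are left out here for simplicity. The exact exception type may vary between pandas versions, so the sketch catches several.

```python
import pandas as pd

# Stand-in for the yf.download(...) result: a hand-built frame (an assumption
# for illustration), with values copied from the head() output in the answer.
timestamps = pd.date_range("2021-11-04 09:30", periods=3, freq="5min")
msft = pd.DataFrame(
    {
        "Open": [332.890015, 330.000000, 330.500000],
        "High": [333.239990, 330.589996, 332.299988],
        "Low": [329.640015, 329.859985, 330.450104],
        "Adj Close": [330.549988, 330.390015, 332.269989],
    },
    index=timestamps,
)

# msft.index holds the row labels (timestamps), so looking up a column
# name on it fails. Several exception types are caught because the exact
# one may differ between pandas versions.
lookup_on_index_failed = False
try:
    msft.index["Adj Close"]
except (IndexError, KeyError, TypeError) as exc:
    lookup_on_index_failed = True
    print("msft.index['Adj Close'] raised", type(exc).__name__)

# Corrected version: columns come from the DataFrame itself,
# written as one statement per line.
x = msft.index
close = msft["Adj Close"]
high = msft["High"]
low = msft["Low"]
openprice = msft["Open"]

print(close)
```

Writing each assignment on its own line also means a traceback points at the one statement that failed, rather than at a line containing five of them.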