python pandas pandas-datareader

Stock price Import issue for a newbie


I'm a total newbie who started learning Python this week. I have been reading DataCamp and some other online resources, as well as Python Without Fear.

I wanted to test whether I could import some price data, so I copied code from the internet. I cannot get it to work because of an error on line 10: TypeError: string indices must be integers

import pandas_datareader as pdr #needed to read data from yahoo

#df = pdr.get_data_yahoo('AAPL')
#print (df.Close)

stock =('AAPL')
start_date = '2017-01-01'
end_date = '2017-12-10'

closes = [c['Close'] for c in pdr.get_data_yahoo(stock, start_date, 
end_date)]

for c in closes:
    print (c)

The line closes = [c.......] is giving me an error.

Any advice on how to fix this? I am just starting my journey; what I am actually trying to do is import the closing prices for the past year for the S&P 500 and save them to Excel. If there is a snippet that already does this and that I can learn from, please let me know.

Thank you all.


Solution

  • The call to get_data_yahoo returns a single dataframe.

    df = pdr.get_data_yahoo(stock, start_date, end_date)
    df.head()
    
                      Open        High         Low       Close   Adj Close  \
    Date                                                                     
    2017-01-03  115.800003  116.330002  114.760002  116.150002  114.311760   
    2017-01-04  115.849998  116.510002  115.750000  116.019997  114.183815   
    2017-01-05  115.919998  116.860001  115.809998  116.610001  114.764473   
    2017-01-06  116.779999  118.160004  116.470001  117.910004  116.043915   
    2017-01-09  117.949997  119.430000  117.940002  118.989998  117.106812   
    
                  Volume  
    Date                  
    2017-01-03  28781900  
    2017-01-04  21118100  
    2017-01-05  22193600  
    2017-01-06  31751900  
    2017-01-09  33561900  
    
    type(df)
    pandas.core.frame.DataFrame
    

    Meanwhile, you're trying to iterate over this returned dataframe. By default, a for loop over a dataframe iterates over its column names, not its rows. For example:

    for c in df:
        print(c)
    
    Open
    High
    Low
    Close
    Adj Close
    Volume
    

    When you replicate this code in a list comprehension, c is bound to each column name (a string) in turn, so c['Close'] indexes a string with a string, which is an invalid operation and raises the TypeError.

    In summary, just doing closes = df['Close'] on the returned result is sufficient to obtain the Close column.
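
    To see the whole thing without hitting the network, here is a minimal sketch that builds a small stand-in dataframe by hand from the first few rows printed above (only the Open and Close columns, which is an assumption made to keep it short). It is not the real result of get_data_yahoo, just something with the same shape, so the failing pattern and the fix can be compared side by side.

```python
import pandas as pd

# Stand-in for the frame returned by pdr.get_data_yahoo(...),
# hand-built from the first rows of the output shown above.
df = pd.DataFrame(
    {
        "Open": [115.800003, 115.849998, 115.919998],
        "Close": [116.150002, 116.019997, 116.610001],
    },
    index=pd.to_datetime(["2017-01-03", "2017-01-04", "2017-01-05"]),
)
df.index.name = "Date"

# Iterating over a DataFrame yields its column labels (strings), not rows.
labels = [c for c in df]

# The original list comprehension indexes each label with 'Close',
# i.e. a string indexed by a string, which raises TypeError.
try:
    [c["Close"] for c in df]
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Selecting the column directly gives a Series indexed by date.
closes = df["Close"]
for date, price in closes.items():
    print(date.date(), price)
```

    For the Excel part of the question, a Series has a to_excel method, so something like closes.to_excel('closes.xlsx') should work once an Excel writer package such as openpyxl is installed; the file name here is only a placeholder.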