Goal:
Calculate 50day moving average for each day, based on the past 50 days. I can calculate the mean for the entire dataset, but I am trying to contiously calculate the mean based on the past 50 days...with it changing each day of course!
import numpy as np
import pandas_datareader.data as pdr
import pandas as pd
# Define the instruments to download. We would like to see Apple, Microsoft and the S&P500 index.
ticker = ['AAPL']
#Define the data period that you would like
start_date = '2017-07-01'
end_date = '2019-02-08'
# User pandas_reader.data.DataReader to load the stock prices from Yahoo Finance.
df = pdr.DataReader(ticker, 'yahoo', start_date, end_date)
# Yahoo Finance gives 'High', 'Low', 'Open', 'Close', 'Volume', 'Adj Close'.
#Export Close PRice, Volume, and Date from yahoo finance
CloseP = df['Close']
CloseP.head()
Volm = df['Volume']
Volm.head()
Date = df["Date"] = df.index
#create a table with Date, Close Price, and Volume
Table = pd.DataFrame(np.array(Date), columns = ['Date'])
Table['Close Price'] = np.array(CloseP)
Table['Volume'] = np.array(Volm)
print (Table)
#create a column that contiosuly calculates 50 day MA
#This is what I can't get to work!
MA = np.mean(df['Close'])
Table['Moving Average'] = np.array(MA)
print (Table)
First of all, please, don't use CamelCase to name your variables, as they look as class names otherwise.
Next, use merge() to join your data frames instead of those yours np.array
way:
>>> table = CloseP.merge(Volm, left_index=True, right_index=True)
>>> table.columns = ['close', 'volume'] # give names to columns
>>> table.head(10)
close volume
Date
2017-07-03 143.500000 14277800.0
2017-07-05 144.089996 21569600.0
2017-07-06 142.729996 24128800.0
2017-07-07 144.179993 19201700.0
2017-07-10 145.059998 21090600.0
2017-07-11 145.529999 19781800.0
2017-07-12 145.740005 24884500.0
2017-07-13 147.770004 25199400.0
2017-07-14 149.039993 20132100.0
2017-07-17 149.559998 23793500.0
Finally, use combination of rolling(), mean() and dropna() to calculate moving average:
>>> ma50 = table.rolling(window=50).mean().dropna()
>>> ma50.head(10)
close volume
Date
2017-09-12 155.075401 26092540.0
2017-09-13 155.398401 26705132.0
2017-09-14 155.682201 26748954.0
2017-09-15 156.025201 27248670.0
2017-09-18 156.315001 27430024.0
2017-09-19 156.588401 27424424.0
2017-09-20 156.799201 28087816.0
2017-09-21 156.952201 28340360.0
2017-09-22 157.034601 28769280.0
2017-09-25 157.064801 29254384.0
Please, refer to the docs of mentioned API calls to get more info about their usage. Good luck!