I have an old CSV file named 'AAPL.csv' with this Yahoo Finance data:
Date | Open | High | Low | Close | Volume |
---|---|---|---|---|---|
2023-09-25 | 174.19 | 176.97 | 174.14 | 176.08 | 46172700 |
2023-09-26 | 174.82 | 175.19 | 171.66 | 171.96 | 64588900 |
2023-09-27 | 172.61 | 173.03 | 169.05 | 170.42 | 66921800 |
2023-09-28 | 169.33 | 172.02 | 167.61 | 170.69 | 56294400 |
2023-09-29 | 172.02 | 173.07 | 170.33 | 171.21 | 51814200 |
I need to compare today's date with the date of the latest row, download the data for the missing days, and save it in the same file. I have this code, but I'm having problems with the date format and it doesn't run correctly.
import os
import datetime
import pandas as pd
import dateutil.parser
from datetime import datetime
read_data = pd.read_csv('AAPL.csv')
read_last_date_df = str(read_data['Date'].values[-1])
last_date_df = dateutil.parser.isoparse(read_last_date_df)
today = datetime.today()
dif_day_dates = int(today.strftime('%d')) - int(last_date_df.strftime('%d'))
if dif_day_dates > 0:
    update_data = yf.download(symbol, start=datetime.today() - last_date_df, end=datetime.today(), interval=timeframes_codedata['daily'])
    read_data[len(read_data)] = update_data
read_data.to_csv('AAPL.csv')
Now I get this error: if dif_day_dates > 0: TypeError: '>' not supported between instances of 'datetime.timedelta' and 'int'
Any suggestions or ideas on how to solve this problem and use this function in my app? Thanks a lot.
The line `if dif_day_dates > 0:` is correct. The problem is:
update_data = yf.download(symbol, start=datetime.today() - last_date_df,
end=datetime.today(), interval=timeframes_codedata['daily'])
Let's look:
start = datetime.today() - last_date_df
print(start)
print(type(start))
'''
(datetime.timedelta(days=6, seconds=54260, microseconds=767898),)
<class 'tuple'>
'''
According to the documentation, the `start` parameter takes a date string in YYYY-MM-DD format (the docstring also allows a datetime), not a timedelta:
def download(tickers, start=None, end=None, actions=False, threads=True, ignore_tz=None,
group_by='column', auto_adjust=False, back_adjust=False, repair=False, keepna=False,
progress=True, period="max", show_errors=None, interval="1d", prepost=False,
proxy=None, rounding=False, timeout=10, session=None):
"""Download yahoo tickers
:Parameters:
tickers : str, list
List of tickers to download
period : str
Valid periods: 1d,5d,1mo,3mo,6mo,1y,2y,5y,10y,ytd,max
Either Use period parameter or use start and end
interval : str
Valid intervals: 1m,2m,5m,15m,30m,60m,90m,1h,1d,5d,1wk,1mo,3mo
Intraday data cannot extend last 60 days
start: str
Download start date string (YYYY-MM-DD) or _datetime, inclusive.
Default is 99 years ago
E.g. for start="2020-01-01", the first data point will be on "2020-01-01"
end: str
Download end date string (YYYY-MM-DD) or _datetime, exclusive.
Default is now
E.g. for end="2023-01-01", the last data point will be on "2022-12-31"
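To make the type mismatch concrete, here is a minimal sketch using only the standard library. The two dates are fixed, illustrative values (the last row of the sample CSV and an assumed "today"), so the result is reproducible. Subtracting two datetimes gives a timedelta (a duration), while the docstring above asks for a date string or a datetime.

```python
from datetime import datetime, timedelta

# Illustrative values: last row of the sample CSV and an assumed "today".
last_date = datetime(2023, 9, 29)
now = datetime(2023, 10, 5, 15, 4)

# datetime - datetime is a duration, not a point in time.
gap = now - last_date

# A date string in the format the docstring describes.
as_string = last_date.strftime("%Y-%m-%d")

print(type(gap).__name__, gap.days)   # timedelta 6
print(as_string)                      # 2023-09-29
```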
To avoid passing a timedelta, build start and end as date strings:
from datetime import timedelta
start = (last_date_df + timedelta(days=1)).strftime('%Y-%m-%d')
end = datetime.now().strftime('%Y-%m-%d')
Now you can use these two values in the `yf.download()` function.
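The same idea can be wrapped in a small helper that also decides whether a download is needed at all. This is only a sketch: `next_download_window` is a hypothetical name, not part of yfinance or pandas, and it assumes the CSV stores dates as YYYY-MM-DD, as in the sample. It compares whole dates rather than the day of the month, because the day-of-month difference from the question goes negative across a month boundary (5 - 29 for 2023-10-05 against 2023-09-29), so the update would be skipped.

```python
from datetime import date, datetime, timedelta

def next_download_window(last_date_str, today=None):
    """Return (start, end) as 'YYYY-MM-DD' strings, or None if nothing is missing.

    Hypothetical helper for illustration; assumes dates are stored as YYYY-MM-DD.
    """
    today = today or date.today()
    last_date = datetime.strptime(last_date_str, "%Y-%m-%d").date()
    start = last_date + timedelta(days=1)   # first day not yet in the file
    if start >= today:                      # end is exclusive, so the range would be empty
        return None
    return start.strftime("%Y-%m-%d"), today.strftime("%Y-%m-%d")

print(next_download_window("2023-09-29", today=date(2023, 10, 5)))
print(next_download_window("2023-09-29", today=date(2023, 9, 30)))
```

With the sample data, the first call returns `('2023-09-30', '2023-10-05')` and the second returns `None`. When a window is returned, its two strings can be passed as `start` and `end`. Since the docstring describes `end` as exclusive, today's row is not included in that window.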