Update only the latest data from yfinance with pandas and datetime

I have an old CSV file named 'AAPL.csv' with this Yahoo Finance data:

Date	Open	High	Low	Close	Volume
2023-09-25	174.19	176.97	174.14	176.08	46172700
2023-09-26	174.82	175.19	171.66	171.96	64588900
2023-09-27	172.61	173.03	169.05	170.42	66921800
2023-09-28	169.33	172.02	167.61	170.69	56294400
2023-09-29	172.02	173.07	170.33	171.21	51814200

I need to compare today's date with the date of the last row, download the data for the missing days, and save it to the same file. I have this code, but I'm having trouble with the date format and it doesn't run correctly.

import os
import datetime
import pandas as pd
import dateutil.parser
from datetime import datetime

read_data = pd.read_csv('AAPL.csv')
read_last_date_df = str(read_data['Date'].values[-1])
last_date_df = dateutil.parser.isoparse(read_last_date_df)

today = datetime.today()
dif_day_dates = int(today.strftime('%d')) - int(last_date_df.strftime('%d'))

if dif_day_dates > 0:
   update_data = yf.download(symbol, start=datetime.today() - last_date_df, end=datetime.today(), interval=timeframes_codedata['daily'])
   read_data[len(read_data)] = update_data
   read_data.to_csv('AAPL.csv')

Now I get this error: if dif_day_dates > 0: TypeError: '>' not supported between instances of 'datetime.timedelta' and 'int'

Any suggestions or ideas for solving this problem so I can use this function in my app? Thanks a lot.

Solution

The line if dif_day_dates > 0: is fine as posted, because dif_day_dates is an int there. The problem is the start argument in this call:

update_data = yf.download(symbol, start=datetime.today() - last_date_df,
end=datetime.today(), interval=timeframes_codedata['daily'])

Let's look:

start = datetime.today() - last_date_df
print(start)
print(type(start))
'''
6 days, 15:04:20.767898
<class 'datetime.timedelta'>
'''
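
To make the distinction concrete, here is a small sketch with fixed dates (the value of now is an illustrative stand-in for datetime.today(), not a real run). Subtracting two datetimes gives a duration, while adding a timedelta to a datetime gives another point in time, which is the kind of value start expects.

```python
from datetime import datetime, timedelta

last_date = datetime(2023, 9, 29)          # last row of the CSV in the question
now = datetime(2023, 10, 5, 15, 4, 20)     # fixed stand-in for datetime.today()

elapsed = now - last_date                  # a timedelta: "how long", not "when"
next_day = last_date + timedelta(days=1)   # a datetime: usable as a start date

print(type(elapsed).__name__, elapsed)     # timedelta 6 days, 15:04:20
print(next_day.strftime('%Y-%m-%d'))       # 2023-09-30
```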

According to the documentation, start must be either a date string in YYYY-MM-DD format or a datetime. A timedelta is neither, so the call fails.

def download(tickers, start=None, end=None, actions=False, threads=True, ignore_tz=None,
             group_by='column', auto_adjust=False, back_adjust=False, repair=False, keepna=False,
             progress=True, period="max", show_errors=None, interval="1d", prepost=False,
             proxy=None, rounding=False, timeout=10, session=None):
    """Download yahoo tickers
    :Parameters:
        tickers : str, list
            List of tickers to download
        period : str
            Valid periods: 1d,5d,1mo,3mo,6mo,1y,2y,5y,10y,ytd,max
            Either Use period parameter or use start and end
        interval : str
            Valid intervals: 1m,2m,5m,15m,30m,60m,90m,1h,1d,5d,1wk,1mo,3mo
            Intraday data cannot extend last 60 days
        start: str
            Download start date string (YYYY-MM-DD) or _datetime, inclusive.
            Default is 99 years ago
            E.g. for start="2020-01-01", the first data point will be on "2020-01-01"
        end: str
            Download end date string (YYYY-MM-DD) or _datetime, exclusive.
            Default is now
            E.g. for end="2023-01-01", the last data point will be on "2022-12-31"

To fix the call, build both arguments as date strings:

from datetime import timedelta
start = (last_date_df + timedelta(days=1)).strftime('%Y-%m-%d')
end = datetime.now().strftime('%Y-%m-%d')

Now you can pass these two values to the yf.download() function.
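
Putting the pieces together, the whole update could look roughly like the sketch below. It is not a tested drop-in: the helper names download_window, merge_rows and update_csv are made up for illustration, the CSV is assumed to be comma-separated with a Date column, and the column flattening assumes that some yfinance versions return two-level columns even for a single ticker. Because end is exclusive according to the docstring, the sketch asks for one day past today so that today's bar is not cut off. Comparing whole dates also sidesteps a pitfall in the day-of-month subtraction from the question: on 2023-10-05 against 2023-09-29 it gives 5 - 29 = -24, so the update would be skipped.

```python
from datetime import date, timedelta

import pandas as pd


def download_window(last_date, today):
    """Return (start, end) as YYYY-MM-DD strings, or None if nothing is missing."""
    start = last_date + timedelta(days=1)
    if start > today:
        return None
    # `end` is exclusive per the docstring, so request one day past `today`.
    end = today + timedelta(days=1)
    return start.isoformat(), end.isoformat()


def merge_rows(old, new):
    """Append `new` to `old` (both indexed by date), keeping one row per date."""
    combined = pd.concat([old, new])
    # If a date appears twice, keep the freshly downloaded row.
    combined = combined[~combined.index.duplicated(keep="last")]
    return combined.sort_index()


def update_csv(path, symbol):
    """Hypothetical helper: add the missing daily rows for `symbol` to `path`."""
    import yfinance as yf  # imported lazily so the helpers above work without it

    # Assumption: comma-separated file with a 'Date' column, as in the question.
    old = pd.read_csv(path, index_col="Date", parse_dates=True)
    window = download_window(old.index[-1].date(), date.today())
    if window is None:
        return old  # file is already up to date

    new = yf.download(symbol, start=window[0], end=window[1], interval="1d")
    if new.empty:
        return old  # e.g. weekend or holiday: no new bars yet

    # Assumption: some yfinance versions return (field, ticker) column pairs.
    if isinstance(new.columns, pd.MultiIndex):
        new.columns = new.columns.get_level_values(0)

    # Keep only the columns already present in the file (drops 'Adj Close').
    combined = merge_rows(old, new[old.columns])
    combined.index.name = "Date"
    combined.to_csv(path)
    return combined
```

With the file from the question, the call would be update_csv('AAPL.csv', 'AAPL'). Running it twice on the same day should not duplicate rows, since merge_rows keeps a single row per date.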