Search code examples
pythondatepandasformatpandas-datareader

Writing a date from a pandas_datareader to database results in -1999


I have written a simple python application to read stock prices from google using the pandas_datareader and store them in the database. I find it strange that when I write the dataset the dates are correct however when I extract the date and use it in an update statement, the date is messed up. Here is my code (excluding import statements and logic) and results:

df = webd.DataReader('WTB.L', 'google', dtStart, dtEnd)
print(df.head())
              Open    High     Low   Close  Volume
Date                                              
2012-03-02  1687.0  1687.0  1672.0  1672.0  341944
2012-03-05  1666.0  1684.0  1660.0  1665.0  333824
lastPriceDateX = pd.to_datetime(df['Date'].tail(1).item())
lastPriceDateY = lastPriceDateX
lastPriceDate = lastPriceDateY.strftime('%d-%m-%Y').isoformat()
print('Last Price Date {}'.format(lastPriceDate))

Last Price Date 21-03-2017

Looks good so far, the date format is exactly as I wanted it! Now I write the date into the SQLITE database: date is stored as -1999

The dataset is written to SQLite database, this has the correct date format:

"12667" "2017-03-16 00:00:00" "WTB.L" "3926.0" "3936.0" "3882.0" "3909.0" "441329"
"12668" "2017-03-17 00:00:00" "WTB.L" "3908.0" "3926.0" "3892.0" "3903.0" "642291"
"12669" "2017-03-20 00:00:00" "WTB.L" "3907.0" "3917.0" "3883.32" "3916.0" "175681"
"12670" "2017-03-21 00:00:00" "WTB.L" "3921.0" "3926.0" "3888.0" "3914.0" "315763"

Code to write this dataset:

df.to_sql('tblStockPricesGoogle', conn,
          if_exists='append', index=False,
          index_label=None, chunksize=None, dtype=None)

I could write a python function to take the latest date from the prices table and write it to the last price. However, I would like to understand why this date prints correctly but it does not write to the database correctly.

Thank you all very much.


More code:

#Gets data from google.
    try:
        print('Try')
        df = webd.DataReader('WTB.L', 'google', dtStart, dtEnd)
        print(df.head())
        df.insert(0,"Symbol", sy)

        df = df.reset_index()

        if df.empty :
            print('DataFrame is empty. There could be various issues. No updates.')

        else :
            print(df.head(1))
            print("Starting update:")

            #Call update function and pass dataframe.

            if sql3.saveDFtoSQL(df):
                #update table with last price, date, updated date etc.

                index is also returned.
                lastPrice = df['Close'].tail(1).item()
                lastPriceDateX = pd.to_datetime(df['Date'].tail(1).item())
                lastPriceDateY = lastPriceDateX
                lastPriceDate = lastPriceDateY.strftime('%d-%m-%Y')
                print("Updated {} prices, task done!".format(sy))
                print('Last  Price {}'.format(lastPrice))
                print('Last Price Date {}'.format(lastPriceDate))
                                    lastUpdate = dt.date(today.year,today.month,today.day).isoformat()
                print('LastUpdate attribute:',lastUpdate)
                sql3.updateListLastUp(sy,lastUpdate,lastPrice,lastPriceDate)


def updateListLastUp(symbol,date,lastPrice,lastPriceDate):  

    try:    
        strUpdate = 'UPDATE tblList SET lastUpdate="{}", LastPrice={},LastPriceDate={}, GetData=0 WHERE Ticker="{}"'.format(date,lastPrice,lastPriceDate,symbol)
        conn = sq3.connect(sql_file)
        conn.execute(strUpdate)
        conn.commit()
        return 'Done'

    except sq3.Error as er:

        print('Failure to update DataSet:', er)
        return er.tostring()

Solution

  • In:

    strUpdate = 'UPDATE tblList SET lastUpdate="{}", LastPrice={},LastPriceDate={}, GetData=0 WHERE Ticker="{}"'.format(date,lastPrice,lastPriceDate,symbol)
    

    You're using string interpolation (a big faux pas as it's insecure as well as leads to surprising results if values aren't escaped appropriately) to construct your query string, so what the DB engine sees is "2017-03-15" which is interpreted as 2017 - 3 - 15 and calculates the value which is 1999 etc...

    All your problems will go away if you use query parameters, eg:

    strUpdate = 'update tblList set lastUpdate=?, lastPrice=?, lastPriceDate=?, GetData=0 where Ticker=?'
    # ...
    conn.execute(strUpdate, (date, lastPrice, lastPriceDate, symbol))