
MySQL Python Connector turns INT into int64, but not back again?


I am using the MySQL Python connector to manipulate a database, but I am running into issues when my queries involve the INT database type. When MySQL retrieves an INT column from the database, it seems to convert it to a Python int64. This is fine, except that it doesn't convert it back into a usable MySQL type.

Here's a reduced example:

This is my MySQL schema for table 'test', with Id as datatype INT:

[Image: schema of table 'test']

My Python code is below. The 2nd execute (an UPDATE query) fails with this exception:

Exception Thrown: Failed processing format-parameters; Python 'int64' cannot be converted to a MySQL type

If I explicitly convert the 'firstId' parameter (which is reported as type <class 'numpy.int64'>) using int(firstId), as suggested in another SO answer, the code runs successfully. I would have assumed, perhaps naively, that if MySQL managed the conversion in one direction, it would manage it in the other. As it is, I don't necessarily know the types that I am getting back from my actual query (I'm using Python ... I shouldn't have to know). Does this mean that I will have to type-check all my Python variables before running MySQL queries?

I tried changing the table column datatype from INT to BIGINT (a 64-bit integer), but I got the same conversion error. Is there perhaps a 32-bit / 64-bit mismatch in the MySQL connector package I am using (mysql-connector-python 8.0.23)?

import mysql.connector as msc
import pandas as pd

def main():
    dbConn = msc.connect(user='********', password='********',
                              host='127.0.0.1',
                              database='********')
    #Open a cursor
    cursor = dbConn.cursor()
    #Find Id of given name
    cursor.execute('SELECT * from test WHERE Name = %s',['Hector'])
    headers = cursor.column_names
    queryVals = list()

    for row in cursor:
        queryVals.append(row)

    cursor.close()
    dfQueryResult = pd.DataFrame(queryVals,columns = headers)
    
    print(dfQueryResult)

    #Change name 
    firstId = dfQueryResult['Id'].iloc[0]
    print('firstId is of type: ',type(firstId))

    cursor = dbConn.cursor()
    cursor.execute('UPDATE test SET Name =%s WHERE Id =%s',['Graham',firstId]) #This line gives the error
    print(cursor.rowcount,' rows updated')
        
    cursor.close()   

    dbConn.commit()
    dbConn.close()

main()

Solution

  • First off, hat-tip to @NonoLondon for their comments and investigative work.

    A pandas DataFrame stores numbers using NumPy types. In this case, the DataFrame constructor was taking a Python 'int' from the MySQL result and converting it into a numpy.int64 object. When this value was passed back as a query parameter, the connector did not recognise numpy.int64 and could not convert it to a MySQL type, as the error message says.
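
The type change can be reproduced without a database. The sketch below assumes the connector hands back plain Python tuples; the rows are made up for illustration. It shows that the value read back out of the DataFrame is a NumPy scalar, which is not a subclass of Python's int:

```python
import numpy as np
import pandas as pd

# Hypothetical rows, standing in for what iterating over the cursor returns.
rows = [(1, 'Hector'), (2, 'Graham')]
df = pd.DataFrame(rows, columns=['Id', 'Name'])

raw_id = rows[0][0]          # the value as it arrived from the "database"
first_id = df['Id'].iloc[0]  # the same value after a round trip through pandas

print(type(raw_id))               # built-in Python int
print(type(first_id))             # a NumPy integer scalar
print(isinstance(first_id, int))  # False: NumPy integers do not subclass int
print(int(first_id) == raw_id)    # True: the value itself is unchanged
```

So the problem is the type of the object rather than its width, which would be consistent with the BIGINT change making no difference.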

    From other SO articles, I discovered the item() method, available on all NumPy scalar types, which converts a value into the corresponding built-in Python type. Since all NumPy scalar types are derived from the base class numpy.generic, I now use the following utility function whenever I extract values from DataFrames:

    import numpy as np
    
    def pyTypeFromNp(val):
        if isinstance(val,np.generic):
            return val.item()
    
        return val
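
As a quick sanity check, the helper can be run over a few values. This is only a sketch: the function is repeated so the snippet runs on its own, and the sample values are made up. NumPy scalars should come back as built-in types, and everything else should pass through untouched:

```python
import numpy as np

def pyTypeFromNp(val):
    # All NumPy scalar types derive from np.generic; item() returns the
    # corresponding built-in Python object.
    if isinstance(val, np.generic):
        return val.item()
    return val

# Made-up sample values: a mix of NumPy scalars and ordinary Python objects.
samples = [np.int64(7), np.float64(2.5), np.bool_(True), np.str_('Hector'),
           7, 'Graham', None]

for s in samples:
    converted = pyTypeFromNp(s)
    print(type(s).__name__, '->', type(converted).__name__)
```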
    

    Hence the amended line is now:

    firstId = pyTypeFromNp(dfQueryResult['Id'].iloc[0])
    

    and the code runs as expected.
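
To avoid converting each variable by hand, the same idea can be applied to a whole parameter list just before calling execute. The sketch below is one possible approach, not part of the connector: to_native_params is a hypothetical helper name, and the DataFrame contents are made up. It also shows Series.tolist(), which returns built-in Python scalars for numeric columns and may be simpler when a whole column is needed:

```python
import numpy as np
import pandas as pd

def to_native_params(params):
    """Return a list in which every NumPy scalar is replaced by its
    built-in Python equivalent; other values are left unchanged."""
    return [p.item() if isinstance(p, np.generic) else p for p in params]

# Made-up data standing in for the query result.
df = pd.DataFrame([(1, 'Hector')], columns=['Id', 'Name'])

params = to_native_params(['Graham', df['Id'].iloc[0]])
print([type(p).__name__ for p in params])

# Alternative: tolist() converts the whole column to Python scalars.
ids = df['Id'].tolist()
print(type(ids[0]).__name__)

# With a live connection, the update would then be issued as:
# cursor.execute('UPDATE test SET Name =%s WHERE Id =%s', params)
```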