Querying DB2 from python using ODBC, I am seeing NULL
values converted to 0 (on Linux, seemingly corrupt but close to 0 on Mac M1 -- even more worryingly).
This is using the db2 docker image started like this:
docker run -itd --name db2 --privileged=true -p 50000:50000 -e LICENSE=accept -e DB2INST1_PASSWORD=xxxxx -e DBNAME=testdb -v <db storage dir>:/database ibmcom/db2
Code as follows recreates the issue:
import pyodbc
cs = "Driver={ODBC Driver v11.5.7 for DB2};Database=xxxxx;Hostname=xxxx;Port=50000;Protocol=TCPIP;Uid=xxxx;Pwd=xxxx;"
cnxn = pyodbc.connect(cs)
crsr = cnxn.cursor()
crsr.execute("SELECT CAST(NULL AS INT), CAST(NULL AS REAL) FROM SYSIBM.SYSDUMMY1");
print(crsr.messages)
print(crsr.fetchall())
Outputs:
❯ python float-test.py
[]
[(2, 4.2439915814e-314)]
Is it expected that I can't retrieve NULL values as plain data types? It seems to be allowed in PostgreSQL. I know I can cast around this but would rather not, obviously.
Extra Info
It does seem that the ODBC driver version 11.5.7 from Fix Central suffers this issue whilst the 11.5.6 version from https://public.dhe.ibm.com/ibmdl/export/pub/software/data/db2/drivers/odbc_cli does not.
As mentioned in comments, it appears pyodbc is impacted and both plain ibm_db and DBI are not impacted (both return None, None). So at least there is a workaround.
The reason for the behaviour deifference is that pyodbc is using SQLGetData()
while the other two use the SQLBindCol()
methods of extracting the result set data.
IBM's clidriver on Linux x64, SQLGetData()
sets the (SQLLEN *)
StrLen_Or_IndPtr parameter to SQL_NULL_DATA
when the value of the column in result-set is NULL. But the problem is that IBMs clidrver sets StrLen_or_IndPtr to SQL_NULL_DATA (as int, 4 bytes), when pyodbc code expects it to SQL_NULL_DATA (as SQLLEN, 8bytes on Linux x64) as SQLLEN
is the documented datatype for the StrLen_or_IndPtr argument.
Therefore the comparison in pyodbc getdata.cpp GetDataDouble()
:
if ( cbFetched == SQL_NULL_DATA )
Py_RETURN_NONE;
will be false, causing the code to return an unitialised variable instead of Py_None.
I do not know if the maintainers of pyodbc run their tests against a Db2-LUW product, but it looks like other parts of the code could suffer the same problem and other issues may lurk. Consider asking on github what is the support policy for Db2-LUW in pyodbc.
If you have a support contract, IBM should also be asked to comment on their reason for not respecting the datatype of StrLen_Or_IndPtr when writing SQL_NULL_DATA to this parameter on Linux x64.