Search code examples
rjdbcnullr-dbi

Can autoconversion of db NULL to 0 be prevented when using RJDBC?


I am connecting to a H2 database using RJDBC (DBI). Fields that are null in the database are being implicitly converted to 0 in R dataframes resulting from calls the dbReadTable or dbGetQuery functions.

I have struggled to find any documentation on this but have found at least one other reference to the problem from someone using RJDBC connecting to a mysql database so I don't currently expect the problem to be with H2.

I would much prefer a NA or NULL where the database value is null. Is there some optional parameter or other means of preventing the RJDBC driver with DBI in R from implicitly replacing database null values?

EDIT: This appears to only happen for columns with a numeric type. VARCHAR based columns converted to factors appear to have NAs for database NULLs. Is implicit 0 conversion a known bug (or dangerous feature) of RJDBC handling NULLs in numeric columns?


Solution

  • It turns out the implicit null to 0 conversion is a "feature" of the JDBC API that RJDBC is a thin wrapper around.

    http://download.oracle.com/javase/6/docs/api/java/sql/ResultSet.html#getDouble(int)

    Consider me flabbergasted.

    The latest development version of RJDBC 0.1-6 available on CRAN now works around this mendacious behavior and correctly returns NA for numeric SQL nulls.