
sqoop import fails with numeric overflow


My Sqoop import job failed with: java.sql.SQLException: Numeric Overflow. I have to load an Oracle table; it has a column of type NUMBER in Oracle (without scale), which is converted to DOUBLE in Hive. These are the largest possible numeric types in both Oracle and Hive. The question is: how do I overcome this error?


Solution

  • OK, my first answer assumed that your Oracle data was good, and your Sqoop job needed specific configuration to cope with NUMBER values.

    But now I suspect that your Oracle data contains garbage, specifically NaN values, as a result of calculation errors.
    See this post for example: When/Why does Oracle adds NaN to a row in a database table

    And Oracle even has distinct "Not-a-Number" categories, plus representations for "infinity", to make things even more complicated.

    But on the Java side, BigDecimal does not support NaN -- from the documentation, for all conversion methods:

    Throws:
      NumberFormatException - if value is infinite or NaN.

    Note that the Oracle JDBC driver masks that exception and reports "Numeric Overflow" instead, which makes things even harder to debug...
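    To see that behavior in isolation, here is a minimal Java sketch (plain JDK, no Oracle driver required; the class name is just for illustration) reproducing the documented rejection of NaN:

        import java.math.BigDecimal;

        public class NanVsBigDecimal {
            public static void main(String[] args) {
                try {
                    // BigDecimal has no representation for NaN or infinity,
                    // so this constructor call throws NumberFormatException.
                    BigDecimal bd = new BigDecimal(Double.NaN);
                    System.out.println(bd); // never reached
                } catch (NumberFormatException e) {
                    // Per the note above, the Oracle JDBC driver masks this
                    // kind of failure and reports "Numeric Overflow" instead.
                    System.out.println("Caught: " + e);
                }
            }
        }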


    So your issue looks like this one: Solr Numeric Overflow (from Oracle)
    -- but unfortunately Solr allows you to skip errors, while Sqoop does not, so you cannot use the same trick.

    In the end, you will have to "mask" these NaN values with the Oracle function NANVL, using a free-form query in Sqoop:

    $ sqoop import --query 'SELECT x, y, NANVL(z, Null) AS z FROM wtf WHERE $CONDITIONS'
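
    Two practical caveats about that command (general Sqoop and Oracle behavior, not details taken from the question): keep the query in single quotes so the shell does not expand $CONDITIONS before Sqoop sees it, and remember that with --query (instead of --table) Sqoop also requires --target-dir plus either --split-by <column> or -m 1. Also, NANVL is meant for Oracle's floating-point types (BINARY_FLOAT / BINARY_DOUBLE), which are the only Oracle numeric types that can actually hold NaN.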