Search code examples
hadoopclouderaimpalahuebigdata

Casting STRING to DATE in Impala. (Possible bugs)


This seems to be an interesting bug/problem I am facing. I am using Impala and HUE which comes under CDH 5.8 (Cloudera Distribution for Hadoop).

On executing the below code

select '1709.02.02' as DateString, CAST((from_unixtime(UNIX_TIMESTAMP('1709.02.02','yyyy.MM.dd'))) as TIMESTAMP) as DateTimestamp

I get output as follows (which is expected)

datestring  datetimestamp
1709.02.02  1709-02-02 00:00:00

But on executing the code below

select '1009.02.02' as DateString, CAST((from_unixtime(UNIX_TIMESTAMP('1009.02.02','yyyy.MM.dd'))) as TIMESTAMP) as DateTimestamp

The output displayed is as follows (which is not expected):

datestring  datetimestamp
1009.02.02  NULL

Can someone kindly shed a light as to why this is happening and/or suggest a possible way-around?

PS: It seems to work fine if the above query is run in Hive. But in Impala we are getting NULL values.


Solution

  • Impala only supports TIMESTAMPs within the years 1400 to 9999, so any TIMESTAMP values outside of those ranges will be NULL.

    https://www.cloudera.com/documentation/enterprise/latest/topics/impala_timestamp.html#timestamp