Search code examples
hadoophiveqlimpala

Impala converting input time to null during daylight savings switch


Came across a strange issue where in when converting a timestamp from PST to GMT, to_utc_timestamp function is returning blank. I could see that this happening for an hour range only on the day daylight savings were switched off in US.

Query: select to_utc_timestamp(cast('2017-11-05 01:00:00' as timestamp),'PST')

Query: select to_utc_timestamp(cast('2017-11-05 01:59:59' as timestamp),'PST')

Above queries return blank output.

However the same things works fine in hive:

Query: select to_utc_timestamp(cast('2017-11-05 01:00:00' as timestamp),'PST')
OK
2017-11-05 09:00:00
Query: select to_utc_timestamp(cast('2017-11-05 01:59:59' as timestamp),'PST')
OK
2017-11-05 09:59:59

Need help understanding reason for the same and also how to work around this issue using Impala query itself.

Impala Version - v2.7.0 on CDH 5.10

Hive version - 1.1.0 on CDH 5.10


Solution

  • I'm using the same Impala and CDH versions as you, and the same bug presents. This workaround would get the right answer in Impala though:

    select case 
        when 
            cast('2017-11-05 01:00:00' as timestamp) >= '2017-11-05 01:00:00' and 
            cast('2017-11-05 01:00:00' as timestamp) <= '2017-11-05 01:59:59' 
        then 
            hours_sub(to_utc_timestamp(hours_add('2017-11-05 01:00:00', 1), 'PST'), 1)
        else 
            to_utc_timestamp(cast('2017-11-05 01:00:00' as timestamp), 'PST') end;