Search code examples
sqlhivetimestamphiveqlmilliseconds

Casting DATE/TIMESTAMP types to NUMERIC is prohibited in Hive 3.1.3


I am trying to cast dates from string format to numeric format in milliseconds trying to keep also the .SSS part as I need to process data at the level of milliseconds duration. While in Hive 1.1.0 I am able to do that with the code below in the newer version it does not let me do that:

select current_timestamp(), unix_timestamp(current_timestamp(), 'yyyy-MM-dd HH:mm:ss.SSS')*1000, cast((cast(date_format(cast(current_timestamp() as string),'yyyy-MM-dd HH:mm:ss.SSS') as timestamp)) as double) * 1000 as time_milliseconds

Can you tell me a workaround to this?

Thank you


Solution

  • Extract millisecond part from string and add to the (timestamp in seconds)*1000

    select current_timestamp(),
    
    --unix_timestamp returns seconds only    
    unix_timestamp(current_timestamp())*1000, --without .SSS * 1000
    
    unix_timestamp(current_timestamp())*1000 +
    bigint(regexp_extract(string(current_timestamp()),'\\.(\\d+)$',1)) --with .SSS
    

    Result:

    2021-09-21 13:52:32.034  1632232352000  1632232352034
    

    Explicit conversion to bigint and string may be not necessary.

    One more method how you can get milliseconds part is to split string by dot and get element #1: split(current_timestamp(),'\\.')[1] instead of regexp_extract(string(current_timestamp()),'\\.(\\d+)$',1):

    select ts,  unix_timestamp(ts_splitted[0])*1000, unix_timestamp(ts_splitted[0]) * 1000 + ts_splitted[1]
    from
    (
    select current_timestamp() ts, split(current_timestamp(),'\\.') ts_splitted
    )s
    

    Result:

    2021-09-21 18:21:11.032   1632248471000  1632248471032
    

    I prefer this method. Of course if you have timestamps with microseconds or nanoseconds, the logic should be adjusted accordingly based on the length of the fractional part.