I am trying to cast dates from string format to numeric format in milliseconds trying to keep also the .SSS part as I need to process data at the level of milliseconds duration. While in Hive 1.1.0 I am able to do that with the code below in the newer version it does not let me do that:
select current_timestamp(), unix_timestamp(current_timestamp(), 'yyyy-MM-dd HH:mm:ss.SSS')*1000, cast((cast(date_format(cast(current_timestamp() as string),'yyyy-MM-dd HH:mm:ss.SSS') as timestamp)) as double) * 1000 as time_milliseconds
Can you tell me a workaround to this?
Thank you
Extract millisecond part from string and add to the (timestamp in seconds)*1000
select current_timestamp(),
--unix_timestamp returns seconds only
unix_timestamp(current_timestamp())*1000, --without .SSS * 1000
unix_timestamp(current_timestamp())*1000 +
bigint(regexp_extract(string(current_timestamp()),'\\.(\\d+)$',1)) --with .SSS
Result:
2021-09-21 13:52:32.034 1632232352000 1632232352034
Explicit conversion to bigint and string may be not necessary.
One more method how you can get milliseconds part is to split string by dot and get element #1: split(current_timestamp(),'\\.')[1]
instead of regexp_extract(string(current_timestamp()),'\\.(\\d+)$',1)
:
select ts, unix_timestamp(ts_splitted[0])*1000, unix_timestamp(ts_splitted[0]) * 1000 + ts_splitted[1]
from
(
select current_timestamp() ts, split(current_timestamp(),'\\.') ts_splitted
)s
Result:
2021-09-21 18:21:11.032 1632248471000 1632248471032
I prefer this method. Of course if you have timestamps with microseconds or nanoseconds, the logic should be adjusted accordingly based on the length of the fractional part.