Search code examples
sqlhivetimestamphiveqlunix-timestamp

Hive/SQL Error when converting milliseconds to DDDD:HH:mm:ss


I have an output in milliseconds that is too big to be described in HH:mm:ss format, I will need to expand to DDDD:HH:mm:ss.

The code I'm currently using only works on big numbers:

select from_unixtime(cast(floor(2513702864/1000) as bigint), 'DDDD:HH:mm:ss');

gives: 0030:02:15:02 , this is correct

select from_unixtime(cast(floor(17259/1000) as bigint), 'DDDD:HH:mm:ss');

gives: 0001:00:00:17 , this is not correct.

select from_unixtime(cast(floor(127259/1000) as bigint), 'DDDD:HH:mm:ss');

gives: 0001:00:02:07, this is also not correct.

How to fix the erroneous 1 in DDDD part when dealing with smaller milliseconds?


Solution

  • The logic is simple math. BIGINT timestamp is the number of seconds or milliseconds passed from Unix Epoch (1970-01-01 00:00:00 UTC).

    To get milliseconds part use (ts % 1000) - returns reminder after division by 1000

    To get total whole seconds passed, use (ts div 1000) - returns integer part, all other figures will be calculated from this number: days, hours, minutes, seconds.

    days: (ts div 1000) div 86400 - returns integer part after division of total seconds by number of seconds in a day

    To get hours left after whole days calculation: take reminder after days calculation ((ts div 1000) % 86400) and divide by number of seconds in hour, take integer part (((ts div 1000) % 86400) div 3600)

    And so on.

    Demo:

    with your_data as (
    select 1 id, bigint(2513702864) ts union all
    select 2, bigint(17259) union all
    select 3,bigint(127259) union all
    select 4,bigint(1272) union all
    select 5,bigint(127)
    )
    
    select --format output as required. For example days:hours:minutes:seconds.millis
           concat(days,':',hours,':',minutes,':',seconds,'.',millis)
    from
    (
    select ((ts div 1000) div 86400) days, --number of whole days
           lpad(((ts div 1000) % 86400) div 3600, 2, 0) hours, --whole hours left
           lpad((((ts div 1000) % 86400) % 3600) div 60, 2, 0) minutes, --whole minutes left
           lpad((((ts div 1000) % 86400) % 3600) % 60, 2, 0) seconds, --seconds left
           (ts % 1000) as millis
      from your_data
    )s
    

    Result:

    1 29:02:15:02.864 --29 whole days, 2 hours, 15 minutes, 2 seconds, 864 millis
    2 0:00:00:17.259 --17 whole seconds and 259 millis
    3 0:00:02:07.259 --two whole minutes, 7 seconds and 259 millis
    4 0:00:00:01.272 --one whole second and millis
    5 0:00:00:00.127 --we have only milliseconds
    

    Now you can see the difference between this calculation and what from_unixtime returns. For record id=1 the number of whole days is 29. Why from_unixtime returns 30 (for pattern 'D')? Because 29 whole days passed and we are 2 hrs 15 min 2 sec 864 mi in a new day 30. In other words, from_unixtime returns timestamp formatted and calculation in my query returns interval formatted, "day in a year" and "whole days passed from" are different things.

    Hope, now it is as clear as a day.

    See also similar question: https://stackoverflow.com/a/57497316/2700344

    And if you need to convert bigint timestamp in milliseconds to string with milliseconds preserved (yyyy-MM-dd HH:mm:ss.SSS) use this:

    select concat(from_unixtime(ts div 1000), '.', (ts  % 1000)) as timestamp_with_millis
      from (select bigint(2513702864)  as ts) s
    

    Result:

    1970-01-30 02:15:02.864