Search code examples
hdfshadoop2apache-drill

TO_DATE usage in Apache Drill to convert UNIXTIMESTAMP to human readable time format


My JSON file a timestamp field which is in UNIX upload timestamp format like 1501580484655. This converts to GMT: 01-08-2017 9:41:24.655 AM

My attempt to convert this timestamp to human readable time format till millisecond is not working.

I have tried below till now:

1- select TO_DATE(1501580484655) from (VALUES(1));

O/P is 2017-08-01 which is OK till date but I want till millisecond.

2- select TO_DATE(1501580484655,'yyyy-MM-dd HH:mm:ss.SSS') from (VALUES(1));

O/P is : Error: SYSTEM ERROR: IllegalArgumentException: Invalid format: "1501580484655" is malformed at "4655"

3- select TO_DATE(1501580484,'yyyy-MM-dd HH:mm:ss.SSS') from (VALUES(1));

O/P is : Error: SYSTEM ERROR: IllegalArgumentException: Invalid format: "1501580484" is malformed at "4"

4- select TO_DATE(150158048,'yyyy-MM-dd HH:mm:ss.SSS') from (VALUES(1));

O/P is : Error: SYSTEM ERROR: IllegalArgumentException: Invalid format: "150158048" is too short

5- SELECT TO_TIMESTAMP(1501580484655) from (VALUES(1));

This works though. O/P is 49553-03-11 18:10:55.0. And when I remove the last three characters 655 then I get O/P as 2017-08-01 09:41:24.0

But When I run this query on my upload timestamp column from json file, then I get:

5a - select TO_TIMESTAMP(SUBSTR((cast(UploadTimeStamp as INTEGER)),1,10)) from dfs.root./test/limit 5;

O/P is:

`Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: [castTINYINT(VARCHAR-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 1:3

[Error Id: 7a91a9f8-9776-4f0d-9b9f-da3e65cc282a on rjio-devcluster1-BigdataNode1:31010] (state=,code=0)

Tried casting to various data types but same error is coming. Tried without cast and same error comes.

What should be the correct query to get O/P as 2017-08-01 HH:mm:ss.SSS


Solution

  • To see millis in the output timestamp value you can specify floating number as a parameter for to_timestamp() function:

    0: jdbc:drill:zk=local> SELECT TO_TIMESTAMP(1427936330) FROM (VALUES(1));
    +------------------------+
    |         EXPR$0         |
    +------------------------+
    | 2015-04-02 00:58:50.0  |
    +------------------------+
    1 row selected (1.612 seconds)
    0: jdbc:drill:zk=local> SELECT TO_TIMESTAMP(1427936330.456) FROM (VALUES(1)); 
    +--------------------------+
    |          EXPR$0          |
    +--------------------------+
    | 2015-04-02 00:58:50.456  |
    +--------------------------+