Apache Pig ToDate UDF Timestamp format


I am using the ToDate UDF in Pig to generate a datetime field. The input is in yyyy-MM-dd format. ToDate(sch_trans_dt,'yyyy-MM-dd','Etc/GMT+7') produces a value with a colon in the UTC offset, for example 2015-11-26T00:00:00.000-07:00. Is there a way to drop the colon so that the generated value is 2015-11-26T00:00:00.000-0700?


Solution

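  • Ref: http://pig.apache.org/docs/r0.12.0/func.html#to-string

    ToDate returns a DateTime object, which Pig prints in ISO 8601 format (with a colon in the offset). To get a custom string format, pass that DateTime to ToString and give the required format string as the second parameter. In the format string, HH is the 24-hour clock hour, mm the minutes, ss the seconds, SSS the milliseconds, and Z the offset without a colon.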

    Pig Script :

    A = LOAD 'input.csv' AS (datestring:chararray);
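    -- HH = hour (00-23), mm = minutes, ss = seconds, SSS = milliseconds, Z = offset without a colon
    B = FOREACH A GENERATE ToString(ToDate(datestring,'yyyy-MM-dd','Etc/GMT+7'),'yyyy-MM-dd\'T\'HH:mm:ss.SSSZ');
    DUMP B;

    Input :

    2015-11-26

    Output :

    (2015-11-26T00:00:00.000-0700)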
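
Pig's format strings follow Java-style date pattern letters, so a pattern can be sanity-checked outside Pig before running a job. The sketch below is only an approximation: it uses java.time from the JDK, which is an assumption and not the class Pig calls internally, and the class name OffsetFormatDemo and the method format are hypothetical. For the letters used here the behaviour should match, but confirm the result in Pig itself. It shows why the choice of letters matters: hh is the 12-hour clock (midnight prints as 12) and a single S prints one fractional digit, while HH and SSS give 00 and 000.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class OffsetFormatDemo {

    // Parse a yyyy-MM-dd date, place it at midnight in Etc/GMT+7 (UTC-07:00),
    // and render it with the given pattern.
    static String format(String date, String pattern) {
        ZonedDateTime dt = LocalDate.parse(date).atStartOfDay(ZoneId.of("Etc/GMT+7"));
        return dt.format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        // 24-hour clock, minutes, seconds, three millisecond digits, offset without colon
        System.out.println(format("2015-11-26", "yyyy-MM-dd'T'HH:mm:ss.SSSZ"));
        // prints 2015-11-26T00:00:00.000-0700

        // 12-hour clock and a single fractional digit: midnight shows up as 12
        System.out.println(format("2015-11-26", "yyyy-MM-dd'T'hh:ss:mm.SZ"));
        // prints 2015-11-26T12:00:00.0-0700
    }
}
```

Note that Etc/GMT+7 follows the POSIX sign convention, so it is seven hours behind UTC, which is why the offset prints as -0700.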