Search code examples
google-cloud-data-fusioncdap

CDAP Trucate a long type column representing an epoch in milliseconds to year/month


I'm using CDAP and Cloud Data Fusion 6.6.0 I have a column called ts with a long representing a timestamp in milliseconds. The type is long.

In the next step, I need to group by year and month, so I need to create two new fields year and month or truncate ts to month.

What is the set of Wrangler directives to do it? The next example is not working because it is a long so I suppose that I need to transform ts to ZoneDateTime, but no idea how to do it:

Tested directives:

parse-as-simple-date:ts
set-column year date:YEAR(ts)
set-column month date:MONTH(ts)

Error

Improper usage of directive 'parse-as-simple-date', usage - 'parse-as-simple-date :column 'format''

Solution

  • These are the wrangler directives to use:

    • timestamp-to-datetime
    • parse-timestamp

    The directive that you are currently using needs to have a date format such as yyyy.MM.dd it would be something like: parse-as-simple-date :ts yyyy.MM.dd but this function parse a DATE STRING to a DATE.

    The wrangler directive timestamp-to-datetime converts a timestamp to a datetime.

    timestamp-to-datetime :ts
    

    Another possible solution is to use parse-timestamp this directive parses column values representing unix timestamp as date and then you could format the result to a DATE pattern that you like.

    parse-timestamp :ts milliseconds
    format-date :ts 'MM/dd/yyyy'