I have rows with the following format and I would like to transform then into valid Hive timestamps. Format in my data:
28/04/2017 00:00:00|20550|22/05/2017 00:00:00|
I'm only interested in the first and third column, separated with |, in MY case the format is, then:
dd/MM/yy HH:mm:ss
I've discovered this can't be used as timestamp in Hive.
I find myself unable to transform all that first and third column to the proper format using queries similar to:
select from_unixtime(unix_timestamp('28/04/2017','dd/MM/yy HH:mm:ss'),'yyyy-MM-dd') from `20170428_f_pers_pers`
I'm trying different instances of that query but since I can't access the documentation (internet is capped here at work), I can't see how to properly use this two functions, from_unixtime
and unix_timestamp
I've made the following assumptions:
I do not care about doing both columns at once, but right now when I do the query showed first I get as many nulls as data has my table, and I'm unsure my assumptions are even partially right since every example I come accross is simpler (they do not change days and years arround, for instance).
I would like to know how to apply the query to a specific column, since I haven't understood how to do that from the examples studied so far. I do not see them using any type of column ID for that, which is weird to me, using data from the column to change the column itself.
Thanks in advance.
edit: I am now trying something like
select from_unixtime(unix_timestamp(f_Date, 'dd/MM/yyyy HH:mm:ss')) from `myTable`
But I get from HUE the following error:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
The format should be completely covered by the input string.
In other words -
The format can be equal in length to the the input string or shorter, but not longer.
28/04/2017 00:00:00
|||||||||||||||||||
dd/MM/yyyy HH:mm:ss
select from_unixtime(to_unix_timestamp('28/04/2017 00:00:00', 'dd/MM/yyyy HH:mm:ss'))
2017-04-28 00:00:00
28/04/2017 00:00:00
||||||||||
dd/MM/yyyy
select from_unixtime(to_unix_timestamp('28/04/2017 00:00:00', 'dd/MM/yyyy'))
2017-04-28 00:00:00
The result can be converted from string to timestamp using cast
select cast (from_unixtime(to_unix_timestamp('28/04/2017 00:00:00', 'dd/MM/yyyy HH:mm:ss')) as timestamp)