I have a date column as a string data type in MMMM Do YYYY, HH:mm:ss.SSS
(December 16th 2019, 21:30:22.000
) format.
I'm trying to convert this into a timestamp data type in hive but couldn't able to achieve it because this format is not available in unixtime.
Is there any way to convert this in hive?
This method will preserve millisecond precision. First extract only parts compatible with SimpleDateFormat pattern using regex, then convert to datetime, concat with milliseconds (milliseconds lost after unix_timestamp conversion) and convert to timestamp:
select timestamp(concat(from_unixtime(unix_timestamp(dt,'MMM dd yyyy HH:mm:ss.SSS')),'.',split(dt,'\\.')[1]))
from
(select regexp_replace('December 16th 2019, 21:30:22.001','([A-Za-z]+ \\d{1,2})[a-z]{0,2} (\\d{4}), (\\d{2}:\\d{2}:\\d{2}\\.\\d+)','$1 $2 $3') as dt --returns December 16 2019 21:30:22.001
) s;
OK
2019-12-16 21:30:22.001
Time taken: 0.09 seconds, Fetched: 1 row(s)