I have below dates in my hive table :
Jan 2014
Oct-13
8-Nov
8-Oct
30-Nov-11
I need to convert them in the 'yyyy-MM-dd' format.
I have used from_unixtime(unix_timestamp(change_log_date ,'yyyyMMdd'), 'yyyy-MM-dd')
to covert date format which is working fine for 30-Nov-11 however since I have different date formats in the data so how to write generic code which will check date format and convert it into 'yyyy-MM-dd'.
I need to put 0 for day/month/year if its not present.
for eg. I need to convert 8-Oct into '0000-10-08'
need help
If you know all possible date format, you can use case statement to apply the correct format for each line:
CASE WHEN change_log_date rlike '[0-9]{2}-[a-zA-Z]{3}-[0-9]{2}'
THEN from_unixtime(unix_timestamp(change_log_date ,'yyyyMMdd'), 'yyyy-MM-dd')
WHEN change_log_date rlike --other regex
THEN -- and so on
If you are more familiar with java or python, you can easily write your piece of code and use it as Hive function.