I have a list of different formats of timestamp. How to change its for to the format accepted in hive tables. For eg. 20210811:12:55:56.563 to 2021-08-11 12:55:56.563 25/05/1999 02:35:05.532 to 1999-05-25 02:35:05.532 . How to do it in python. I have around 7-8 different formats.
Does anyone have any ideas or approach around it. Your ideas are most welcome.
You can use below py function to check the format. If below functions returns not none values, then its in expected format else no. Return value will be a date time in yyyy-MM-dd HH:MI:SS.SSSSS
format. You can easily insert this into hive date time field.
import datetime
#formats to be checked
fmts=['%d/%m/%Y %H:%M:%S.%f','%Y%m%d %H:%M:%S.%f','%d-%m-%Y %H:%M:%S.%f']
#func to check the formats and return proper date time if its in correct format.
def try_strptime(s, fmts=fmts):
for fmt in fmts:
try:
return datetime.strptime(s, fmt)
except:
continue
return None