Search code examples
pythonhive

How to convert different formats of date timestamp to the format of timestamp in hive table


I have a list of different formats of timestamp. How to change its for to the format accepted in hive tables. For eg. 20210811:12:55:56.563 to 2021-08-11 12:55:56.563 25/05/1999 02:35:05.532 to 1999-05-25 02:35:05.532 . How to do it in python. I have around 7-8 different formats.

Does anyone have any ideas or approach around it. Your ideas are most welcome.


Solution

  • You can use below py function to check the format. If below functions returns not none values, then its in expected format else no. Return value will be a date time in yyyy-MM-dd HH:MI:SS.SSSSS format. You can easily insert this into hive date time field.

    import datetime
    
    #formats to be checked
    fmts=['%d/%m/%Y %H:%M:%S.%f','%Y%m%d %H:%M:%S.%f','%d-%m-%Y %H:%M:%S.%f']
    
    #func to check the formats and return proper date time if its in correct format.
    def try_strptime(s, fmts=fmts):
        for fmt in fmts:
            try:
                return datetime.strptime(s, fmt)
            except:
                continue
    
        return None