Search code examples
pythonpandasdatedate-conversion

Pandas: Adding 0 to a long integer representing a date


I am trying to import a csv into panda. This csv contains a data column as follow (for instance):

20170607155749330: (Year/Month/Day/Hour/Minute/Second/decisecond/centisecond/millisecond). So there are 17 numbers representing the date.

Issue I have is that for this particular dataset is that some row have dates at a different precision, for instance: 2017060715581995 : here there are only 16 numbers since the millisecond is missing.

What I want to do is simple: converting every date for every row in a way that I would end up with 17 digit, by adding a 0 when one of them is missing.

Here is the function I wrote for this:

def convert_value(val):
   return ('{:0<17}'.format(val))

So I try to apply it to every row by just creating a new column such as:

file_temp['Time_Transformed'] = file_temp.apply( lambda x: convert_value(x["TIME"]) , axis =1)

And I am facing two issues:

  1. Firstly, the newly created column is not of integer but of object type;
  2. Secondly, it just does not work, 0 is not added to shorter date.

I tried various other approaches by converting the original data to string but they are not working either.


Solution

  • I think you need astype with str.ljust:

    df = pd.DataFrame({'A':[20170607155749330,20170607155749]})
    
    df['new'] = df['A'].astype(str).str.ljust(17, '0').astype('int64')
    print (df)
                       A                new
    0  20170607155749330  20170607155749330
    1     20170607155749  20170607155749000
    

    And for datetime:

    df['date'] = pd.to_datetime(df['A'].astype(str).str.ljust(17, '0'), format='%Y%m%d%M%S%f')
    print (df)
                       A                       date
    0  20170607155749330 2017-06-07 00:15:57.493300
    1     20170607155749 2017-06-07 00:15:57.490000
    

    is same as:

    df['date'] = pd.to_datetime(df['A'], format='%Y%m%d%M%S%f')
    print (df)
                       A                       date
    0  20170607155749330 2017-06-07 00:15:57.493300
    1     20170607155749 2017-06-07 00:15:57.490000