I am trying to import a csv into panda. This csv contains a data column as follow (for instance):
20170607155749330: (Year/Month/Day/Hour/Minute/Second/decisecond/centisecond/millisecond). So there are 17 numbers representing the date.
Issue I have is that for this particular dataset is that some row have dates at a different precision, for instance: 2017060715581995 : here there are only 16 numbers since the millisecond is missing.
What I want to do is simple: converting every date for every row in a way that I would end up with 17 digit, by adding a 0 when one of them is missing.
Here is the function I wrote for this:
def convert_value(val):
return ('{:0<17}'.format(val))
So I try to apply it to every row by just creating a new column such as:
file_temp['Time_Transformed'] = file_temp.apply( lambda x: convert_value(x["TIME"]) , axis =1)
And I am facing two issues:
I tried various other approaches by converting the original data to string but they are not working either.
I think you need astype
with str.ljust
:
df = pd.DataFrame({'A':[20170607155749330,20170607155749]})
df['new'] = df['A'].astype(str).str.ljust(17, '0').astype('int64')
print (df)
A new
0 20170607155749330 20170607155749330
1 20170607155749 20170607155749000
And for datetime:
df['date'] = pd.to_datetime(df['A'].astype(str).str.ljust(17, '0'), format='%Y%m%d%M%S%f')
print (df)
A date
0 20170607155749330 2017-06-07 00:15:57.493300
1 20170607155749 2017-06-07 00:15:57.490000
is same as:
df['date'] = pd.to_datetime(df['A'], format='%Y%m%d%M%S%f')
print (df)
A date
0 20170607155749330 2017-06-07 00:15:57.493300
1 20170607155749 2017-06-07 00:15:57.490000