I have a machine data set I need to analyze but in parsing an excel file from the line controller PC, the date info (which starts out formatted like "10/09/2020") ends up as what I assume is a unix time stamp that's longer than I've ever seen or can find an example of on SO (e.g. 1599696000000000000).
Efforts I have made to covert it have failed. I am doing a python coursera program and the lecturer did say it is a difficult proposition as there's no standard way and none that I have tried have worked
Anyway, the excel file looks like this
Datum WKZ_code Time Rad_t1 Not Important Cut_Event
1599696000000000000 W03 00:00:00 100 250 /
1599696000000000000 W03 00:00:01 100 250 /
1599696000000000000 W03 00:00:02 100 250 /
1599696000000000000 W03 00:00:03 100 250 /
1599696000000000000 W03 00:00:04 100 250 /
1599696000000000000 W03 00:00:00 100 250 Speed Cut
I need to be able to bring the Datum column back to "DD/MM/YYYY" format otherwise my plots of the cut data fail as python seems to be unable to determine where to put them on a y axis.
What I tried that appeared the neatest solution was
pd.Timestamp.combine(pd.Timestamp.utcfromtimestamp(df['DATUM'][0]/1E9), pd.to_datetime(df['Time'][0]))
but that produced the error
TypeError Traceback (most recent call last)
<ipython-input-102-b9d1b1186db5> in <module>
----> 1 pd.Timestamp.combine(pd.Timestamp.utcfromtimestamp(df['DATUM'][0]/1E9), pd.to_datetime(df['Time'][0]))
pandas\_libs\tslibs\timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.combine()
TypeError: combine() argument 2 must be datetime.time, not Timestamp
A python developer in another department suggested:
df = pd.read_excel("tool W03 with cuts and dates.xlsx",
dtype=object)
df = df[['DATUM']]
df = df['DATUM']=(pd.to_datetime(df['DATUM'],unit='s'))
df.head()
Which also threw an error around the unit, but I have been unable to come up with an a cure
pandas\_libs\tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()
OutOfBoundsDatetime: cannot convert input 1599696000000000000 with the unit 's
If anyone can provide a solution i'd be very, very grateful. I've tried hard to keep to conventions with asking this, apologies in advance for errors.
Use to_datetime
with default unit='ns'
and convert times to timedeltas by to_timedelta
:
df['Datum'] = pd.to_datetime(df['Datum']) + pd.to_timedelta(df['Time'])
print (df)
Datum WKZ_code Time Rad_t1 Not Important Cut_Event
0 2020-09-10 00:00:00 W03 00:00:00 100 250 /
1 2020-09-10 00:00:01 W03 00:00:01 100 250 /
2 2020-09-10 00:00:02 W03 00:00:02 100 250 /
3 2020-09-10 00:00:03 W03 00:00:03 100 250 /
4 2020-09-10 00:00:04 W03 00:00:04 100 250 /
5 2020-09-10 00:00:00 W03 00:00:00 100 250 Speed Cut