I have a data file that has values in it like this:
@ DD MM YYYY HH MN SS Hs Hrms Hma x Tz Ts Tc THmax EP S T0 2 Tp Hrms EPS
29 11 2000 13 17 56 2.44 1.71 3.12 9.12 11.94 5.03 12.74 .83 8.95 15.03 1.80 .86
29 11 2000 13 31 16 2.43 1.74 4.16 9.17 11.30 4.96 11.70 .84 8.84 11.86 1.80 .87
I use the following to get the data in:
infile = open ("testfile.txt", 'r')
data = np.genfromtxt(infile,skiprows=2)
which gives me a numpy.ndarray
I want to be able to interpret the first 0-5 columns as a timestamp (DD:MM:YYY:HH:MN:SS), but this is where I get stumped - there seems to be a million ways to do it and I don't know what's best.
I've been looking at dateutil and pandas - I know there is something blindingly obvious I should do, but am at a loss. Should I convert to a csv format first? Somehow concatenate the values from each row (cols 0-5) using a for loop?
After this I'll plot values from other columns against the timestamps/deltas.
I'm totally new to python, so any pointers appreciated :)
Here's a pandas
solution for you:
test.csv:
29 11 2000 13 17 56 2.44 1.71 3.12 9.12 11.94 5.03 12.74 .83 8.95 15.03 1.80 .86
29 11 2000 13 31 16 2.43 1.74 4.16 9.17 11.30 4.96 11.70 .84 8.84 11.86 1.80 .87
pandas
provide a read_csv util for reading the csv, you should give the following parameters to parse your file:
dateutil.parser.parse
, but seems it doesn't work for your case, so you should implement your own parserNone
Finally, here the sample code:
In [131]: import datetime as dt
In [132]: import pandas as pd
In [133]: pd.read_csv('test.csv',
parse_dates=[[2,1,0,3,4,5]],
date_parser=lambda *arr:dt.datetime(*[int(x) for x in arr]),
delimiter=' ',
header=None)
Out[133]:
2_1_0_3_4_5 6 7 8 9 10 11 12 13 14 \
0 2000-11-29 13:17:56 2.44 1.71 3.12 9.12 11.94 5.03 12.74 0.83 8.95
1 2000-11-29 13:31:16 2.43 1.74 4.16 9.17 11.30 4.96 11.70 0.84 8.84
15 16 17
0 15.03 1.8 0.86
1 11.86 1.8 0.87