python pandas matplotlib dataframe genfromtxt

Reading dates and data from a file (Python)


I want to read time strings and numeric data from a file. With loadtxt I can't read strings and numbers at the same time, because the strings are not floats. So I tried genfromtxt with delimiter=[]+[]+[] according to the columns I have, but the strings are read as nan. I'd like to read the time directly as a time array (date2num, datetime or similar) so that I can plot it correctly in matplotlib. What can I do? A sample of my data is below (the real file has many more rows):

GOES data for time interval: 20-Feb-2014 00:00:00.000 to 27-Feb-2014 00:00:00.000
Current time: 23-Mar-2014 21:52:00.00

Time at center of bin        1.0 - 8.0 A    0.5 - 4.0 A  Emission Meas           Temp
                              watts m^-2     watts m^-2    10^49 cm^-3             MK
20-Feb-2014 00:00:00.959     4.3439e-006    3.9946e-007        0.30841         10.793
20-Feb-2014 00:00:02.959     4.3361e-006    3.9835e-007        0.30801         10.789
20-Feb-2014 00:00:04.959     4.3413e-006    3.9501e-007        0.30994         10.743
20-Feb-2014 00:00:06.959     4.3361e-006    3.9389e-007        0.30983         10.735
20-Feb-2014 00:00:08.959     4.3361e-006    3.9278e-007        0.31029         10.722
20-Feb-2014 00:00:10.959     4.3387e-006    3.9278e-007        0.31058         10.719
20-Feb-2014 00:00:12.959     4.3361e-006    3.9278e-007        0.31029         10.722
20-Feb-2014 00:00:14.959     4.3361e-006    3.9055e-007        0.31122         10.695
20-Feb-2014 00:00:16.959     4.3334e-006    3.8721e-007        0.31234         10.657

Following the suggestions, I read the data using:

pd.read_csv('/filename', sep=r'\s\s+', header=5,
            names=['time', 'band1', 'band2', 'emeas', 'temp'])

and the data was read, but there is one problem. When I print the data, it appears as:

                       time     band1  band2    emeas    temp
0  20-Feb-2014 00:00:03.005  0.000004      0  0.31000  10.866
1  20-Feb-2014 00:00:05.052  0.000004      0  0.31199  10.819
2  20-Feb-2014 00:00:07.102  0.000004      0  0.31190  10.811
3  20-Feb-2014 00:00:09.149  0.000004      0  0.31237  10.798
4  20-Feb-2014 00:00:11.199  0.000004      0  0.31266  10.795
5  20-Feb-2014 00:00:13.245  0.000004      0  0.31237  10.798
6  20-Feb-2014 00:00:15.292  0.000004      0  0.31334  10.770
7  20-Feb-2014 00:00:17.342  0.000004      0  0.31451  10.732
8  20-Feb-2014 00:00:19.389  0.000004      0  0.31451  10.732
9  20-Feb-2014 00:00:21.439  0.000004      0  0.31421  10.735

So apparently the values of band1 and band2 have been rounded. When plotted they appear correct (not rounded), so why do they look like that in the DataFrame?


Solution

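  • You can use pandas.read_csv(), because its sep parameter (the equivalent of delimiter in numpy.genfromtxt) accepts regular expressions. The pattern \s\s+ matches runs of two or more whitespace characters, so the columns are split while the single space inside each timestamp is left intact. With: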

    import pandas as pd
    
    pd.read_csv('test.txt', sep=r'\s\s+', header=4, engine='python')

    you will get the desired output.
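
To get the times as real datetimes for plotting, and to check whether the band values were actually rounded, something along these lines may help. It is only a sketch: the rows are copied from the question into a string, the column names are the ones the asker chose, and skiprows=5 assumes exactly five preamble lines (blank line included) before the first data row. The rounding seen in the question is most likely a display effect: pandas shortens floats when printing (the display.precision option defaults to 6), while the stored values keep full precision, which is consistent with the plot looking correct.

```python
import io

import pandas as pd

# A few rows copied from the question, including the five preamble lines.
sample = """GOES data for time interval: 20-Feb-2014 00:00:00.000 to 27-Feb-2014 00:00:00.000
Current time: 23-Mar-2014 21:52:00.00

Time at center of bin        1.0 - 8.0 A    0.5 - 4.0 A  Emission Meas           Temp
                              watts m^-2     watts m^-2    10^49 cm^-3             MK
20-Feb-2014 00:00:00.959     4.3439e-006    3.9946e-007        0.30841         10.793
20-Feb-2014 00:00:02.959     4.3361e-006    3.9835e-007        0.30801         10.789
20-Feb-2014 00:00:04.959     4.3413e-006    3.9501e-007        0.30994         10.743
"""

df = pd.read_csv(
    io.StringIO(sample),   # replace with the path to the real file
    sep=r"\s\s+",          # split on runs of two or more whitespace characters
    engine="python",       # regex separators need the Python parser
    skiprows=5,            # assumption: five preamble lines before the data
    header=None,
    names=["time", "band1", "band2", "emeas", "temp"],
)

# Convert the time strings (e.g. "20-Feb-2014 00:00:00.959") to datetimes.
df["time"] = pd.to_datetime(df["time"], format="%d-%b-%Y %H:%M:%S.%f")

# Print with scientific notation to confirm the small values are intact.
with pd.option_context("display.float_format", "{:.4e}".format):
    print(df)
```

Because the time column now has a datetime dtype, it can be passed straight to matplotlib, for example plt.plot(df['time'], df['band1']).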