
Python converting text data from dat file to int


I'm trying to write a simple plot program for my company. I have a .dat file with data and to read it I do:

with open(r'XXX\DAT-010.DAT', 'r') as f:
    data = f.readlines()
print(data)

Result:

 ['      Date      Time    Elapsed    Sensor1    Sensor2    Sensor3    Sensor4    Sensor5    Sensor6    Sensor7    Sensor8    Sensor9   Sensor10   Sensor11   Sensor12   Sensor13   Sensor14   Sensor15   Sensor16   Sensor17   Sensor18   Sensor19   Sensor20\n',
 'dd/mm/yyyy  hh:mm:ss    Seconds JGP1103-I2 JGP1102-I2 JGP1102-I1    JGP1001    JGP1101   FLOW_416   FLOW_333  FLOW_2945     L1_INJ     L2_INJ     L3_INJ     L4_INJ     L1_EXT     L2_EXT     L3_EXT     L4_EXT L1_Mth_ext L2_Mth_ext L3_Mth_ext L4_Mth_ext\n',
 '         -         -          -        kPa        kPa        kPa        kPa        kPa     ml/min     ml/min     ml/min         mV         mV         mV         mV         mV         mV         mV         mV         mV         mV         mV         mV\n',
 '         -         -          -          -          -          -          -          -          -          -          -          -          -          -          -          -          -          -          -          -          -          -          -\n',
 '----------  --------  --------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------\n',
 '26.10.2016  08:58:09    1211242      84.95      84.77      86.21      84.47      84.77    -104.78     -83.82          -          -          -          -          -          -          -          -          -          -          -          -          -\n',
 '26.10.2016  08:58:24    1211257      85.01      84.77      86.03      84.53      84.77    -104.78     -83.82          -          -          -          -          -          -          -          -          -          -          -          -          -\n']

Now, to find the actual values, I'm doing:

data_int = list(map(float, data))

and I get the following error:

ValueError: could not convert string to float: '      Date      Time    Elapsed    Sensor1    Sensor2    Sensor3    Sensor4    Sensor5    Sensor6    Sensor7    Sensor8    Sensor9   Sensor10   Sensor11   Sensor12   Sensor13   Sensor14   Sensor15   Sensor16   Sensor17   Sensor18   Sensor19   Sensor20\n'

So I did:

data_int = list(map(float, data[6]))

This was to try it on a line that should contain only actual data values, but I got:

ValueError: could not convert string to float: '.'

Now, how can I efficiently convert this data to an analyzable list of values? How can I convert the text data to integers? For the record, I tried int(data) etc. and it didn't work.


Solution

  • Each line of your file is a string, so data[5] contains:

    "26.10.2016 08:58:09 1211242 84.95 84.77 86.21 84.47 84.77 -104.78 -83.82 - - - - - - - - - - - - -\n" 
    

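    You cannot convert that whole string to a float. Note that map(float, data[6]) iterates over the string character by character, which is why it fails at '.'. You need to

    • split the line into parts
    • convert each part according to its data type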

    line = "26.10.2016 08:58:09 1211242 84.95 84.77 86.21 84.47 84.77 -104.78 -83.82 - - - - - - - - - - - - -\n"
    
    def tryFloat(text):
        """Return float(text) if possible, otherwise the text unchanged."""
        try:
            return float(text)
        except ValueError:
            return text
    
    # strip() removes "\n" and other whitespace at both ends;
    # split() splits on runs of whitespace, treating several as one
    data = list(map(tryFloat, line.strip().split()))
    
    print(data)
    

    Output (a list of values):

    ['26.10.2016', '08:58:09', 1211242.0, 84.95, 84.77, 86.21, 84.47, 84.77, -104.78, 
     -83.82, '-', '-', '-', '-', '-', '-', '-', '-', '-', '-', '-', '-', '-']
    

    You can then pick parts from that list:

    print(data[3])  # 84.95
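
The same idea can be extended to every data line of the file. The following is only a minimal sketch: the helper name parse_dat_lines, the header_rows=5 default and the choice to turn "-" into None are assumptions based on the sample output in the question, not something the file format guarantees. The sample list is a shortened stand-in for f.readlines() with fewer columns.

```python
def try_float(text):
    """Return float(text) if possible, otherwise the original string."""
    try:
        return float(text)
    except ValueError:
        return text


def parse_dat_lines(lines, header_rows=5, missing="-"):
    """Parse the lines of the file into column names and rows of values.

    header_rows and missing are assumptions taken from the sample output
    in the question (five header lines, "-" for an empty reading).
    """
    names = lines[0].split()      # column names from the first header line
    rows = []
    for line in lines[header_rows:]:
        parts = line.split()      # split() also discards the trailing "\n"
        if not parts:
            continue              # ignore blank lines
        rows.append([None if p == missing else try_float(p) for p in parts])
    return names, rows


# Shortened stand-in for f.readlines(): same layout, fewer columns.
sample = [
    "      Date      Time    Elapsed    Sensor1    Sensor2\n",
    "dd/mm/yyyy  hh:mm:ss    Seconds JGP1103-I2     L1_INJ\n",
    "         -         -          -        kPa         mV\n",
    "         -         -          -          -          -\n",
    "----------  --------  --------- ---------- ----------\n",
    "26.10.2016  08:58:09    1211242      84.95          -\n",
    "26.10.2016  08:58:24    1211257      85.01          -\n",
]

names, rows = parse_dat_lines(sample)

# Pull out one column by name, e.g. for plotting.
sensor1 = [row[names.index("Sensor1")] for row in rows]

print(names)
print(rows[0])
print(sensor1)
```

Date and time stay as strings here. If you need them as real timestamps, datetime.strptime with the format "%d.%m.%Y %H:%M:%S" should match the values shown in the question.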