Opening text file in python as an array or list of list

I am trying to open a text file in python as an array or a list of list. The file looks like below.
Also, here is a link to the text file.
ftp://rammftp.cira.colostate.edu/demaria/ebtrk/ebtrk_atlc.txt

AL0188 ALBERTO   080518 1988 32.0  77.5  20 1015 -99 -99  -99 -99   0  0  0  0   0  0  0  0   0  0  0  0 *   218.  
AL0188 ALBERTO   080600 1988 32.8  76.2  20 1014 -99 -99  -99 -99   0  0  0  0   0  0  0  0   0  0  0  0 *   213.  
AL0188 ALBERTO   080712 1988 41.5  69.0  35 1002 -99 -99 1012  60 100100 50 50   0  0  0  0   0  0  0  0 *   118.  
AL0188 ALBERTO   080718 1988 43.0  67.5  35 1002 -99 -99 1008  50 100100 50 50   0  0  0  0   0  0  0  0 *   144.  
AL0188 ALBERTO   080800 1988 45.0  65.5  35 1004 -99 -99 1008  50 -99-99-99-99   0  0  0  0   0  0  0  0 *    22.  
AL0188 ALBERTO   080806 1988 47.0  63.0  35 1006 -99 -99 1008  50 -99-99-99-99   0  0  0  0   0  0  0  0 *    64.

I have tried using NumPy genfromtxt but it returned with an error, because it couldn't tell for example that 100100 is two elements in two columns. It treated it as one entry in a column, and so returned error saying the number of columns in each row didn't match.

Is there some way to fix this? Thank you

Solution

You can supply the delimiter sizes as argument. Example:

import numpy as np
import sys

with open('ebtrk_atlc.txt', 'rU') as f:
    data = np.genfromtxt(f,
                         dtype=None,
                         delimiter=[7, 10, 7, 4, 5, 6, 4, 5, 4, 4, 5, 4, 4, 3, 3, 3])
    print data

will give as output (omitting the first few lines)

('AL0188 ', 'ALBERTO   ', 80712, 1988, 41.5, 69.0, 35, 1002, -99, -99, 1012, 60, 100, 100, 50, 50)
('AL0188 ', 'ALBERTO   ', 80718, 1988, 43.0, 67.5, 35, 1002, -99, -99, 1008, 50, 100, 100, 50, 50)
('AL0188 ', 'ALBERTO   ', 80800, 1988, 45.0, 65.5, 35, 1004, -99, -99, 1008, 50, -99, -99, -99, -99)

As you see the 100100 field got separated. Of course you have to supply the correct field types and dimensions, this example just demonstrates that it is possible. For example, changing the code to

import numpy as np
import re
import sys

with open('ebtrk_atlc.txt', 'rU') as f:
    dt = "a7,a10,a7,i4,f5,f6,i4,i5,i4,i4,i5,i4,i4,i3,i3,i3"
    data = np.genfromtxt(f,
                         dtype=dt,
                         delimiter=map(int, re.split(",?[a-z]", dt[1:])),
                         autostrip=True)

will change the result to

('AL0188', 'ALBERTO', '080712', 1988, 41.5, 69.0, 35, 1002, -99, -99, 1012, 60, 100, 100, 50, 50)
('AL0188', 'ALBERTO', '080718', 1988, 43.0, 67.5, 35, 1002, -99, -99, 1008, 50, 100, 100, 50, 50)
('AL0188', 'ALBERTO', '080800', 1988, 45.0, 65.5, 35, 1004, -99, -99, 1008, 50, -99, -99, -99, -99)

Stripping away the whitespace around the strings and explicitly setting some types to float. Further documentation can be found here, check the example at the bottom.