Tags: python, file-io, large-files

Smart way to read a big input file with multiple unlabeled variables (arranged in columns) in Python


I have the following code, which runs over a file of more than a million lines, but it takes a long time. Is there a faster way to read such files? The current code looks like this:

for line in lines:
    line = line.strip()             #Strips leading/trailing whitespace
    columns = line.split()          #Splits the line into individual strings
    x = columns[0]                  #Reads in x position
    x = float(x)                    #Converts the string to float
    y = columns[1]                  #Reads in y position
    y = float(y)                    #Converts the string to float
    z = columns[2]                  #Reads in z position
    z = float(z)                    #Converts the string to float

The file data looks like this:

  347.528218024     354.824474847   223.554247185   -47.3141937738  -18.7595743981   
  317.843928028     652.710791858   795.452586986   -177.876355361  7.77755408015   
  789.419369714     557.566066378   338.090799912   -238.803813301  -209.784710166   
  449.259334688     639.283337249   304.600907059   26.9716202117   -167.461497735  
  739.302109761     532.139588049   635.08307865    -24.5716064556  -91.5271790951  

I want to extract the numbers column by column; every element in a column belongs to the same variable. How do I do that? For example, I want a list, say l, that stores the floats from the first column.


Solution

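  • It would help to know what you plan to do with the data, but you might try:

    data = [list(map(float, line.split())) for line in lines]
    

    This gives you a list of lists, one inner list of floats per line. (In Python 3, map returns an iterator, so the list() call is needed; in Python 2 a bare map(float, ...) already returned a list.)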
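
The list of lists above is organized by row, while the question asks for one list per column. One way to get there is to transpose the rows with zip(*rows). What follows is a minimal sketch rather than a tuned solution. The helper name read_columns and the in-memory sample are assumptions made for illustration; with a real file you would pass the open file object instead (the file name in the comment is hypothetical).

```python
import io

# Hypothetical in-memory sample standing in for the real file. With a file on
# disk you could pass an open file object instead, e.g. open("data.txt")
# (that file name is an assumption, not from the question).
sample = io.StringIO(
    "347.528218024 354.824474847 223.554247185 -47.3141937738 -18.7595743981\n"
    "317.843928028 652.710791858 795.452586986 -177.876355361 7.77755408015\n"
)


def read_columns(lines):
    """Parse whitespace-separated floats and return one list per column."""
    # One list of floats per non-blank line.
    rows = [[float(value) for value in line.split()]
            for line in lines if line.strip()]
    # zip(*rows) regroups the values by position, i.e. transposes rows
    # into columns.
    return [list(column) for column in zip(*rows)]


columns = read_columns(sample)
x, y, z = columns[0], columns[1], columns[2]
print(x)  # the floats of the first column
```

Note that zip stops at the shortest row, so this assumes every line has the same number of fields. Iterating over the file object directly, rather than calling readlines() first, also avoids holding the raw text of all lines in memory alongside the parsed floats.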