python loops file-io numpy post-processing

Python: read timesteps from csv to arrays: Post-processing model-data with numpy;

I am still trying to come around with python, but this problem exceeds my knowledge:

Topic: hydrodynamic postprocessing: csv output of hydraulic software to array, split timesteps

Here is the data and how far i came with a working code:

Input-file (see below):

First row: Number of result-nodes

Second row: Header

Third row: timestep @ time=

Following: all results of this timestep (in this file: 13541 nodes, variable) ....the same again for next timestep.

# Number of Nodes: 13541
#X                  Y                   Z                   depth               wse             
# Output at t = 0
       5603.7598           4474.4902           37.470001                   0           37.470001
          5610.5           4461.6001           36.020001                   0           36.020001
         5617.25             4448.71           35.130001                   0           35.130001
       5623.9902           4435.8198               35.07                   0               35.07
       5630.7402           4422.9199               35.07                   0               35.07
       5761.5801             4402.79           35.369999                   0           35.369999
COMMENT:....................13541 timesteps...........
# Output at t = 120.04446
       5603.7598           4474.4902           37.470001           3.6977223           41.167724
          5610.5           4461.6001           36.020001           4.1377293            40.15773
         5617.25             4448.71           35.130001           3.9119012           39.041902
       5623.9902           4435.8198               35.07           3.7923947           38.862394
       5630.7402           4422.9199               35.07            3.998436           39.068436
       5761.5801             4402.79           35.369999           3.9750571           39.345056
COMMENT:....................13541 timesteps...........
# Output at t = 240.06036
       5603.7598           4474.4902           37.470001           11.131587           48.601588
          5610.5           4461.6001           36.020001           12.564266           48.584266
         5617.25             4448.71           35.130001           13.498463           48.628464
       5623.9902           4435.8198               35.07           13.443041           48.513041
       5630.7402           4422.9199               35.07           11.625824           46.695824
       5761.5801             4402.79           35.369999            19.49551           54.865508

PROBLEM: I need a loop, which reads in n-timesteps into arrays.

The result should be: array for each timestep: in this case 27 timesteps with 13541 elements each.

timestep_1=[all elements of this timestep: shape=13541,5]

timestep_2=[]

timestep_3[]

........

timestep_n=[]

My code so far:

 import numpy as np
 import csv
 from numpy import *
 import itertools

 #read file to big array
 array=np.array([row for row in csv.reader(open("ascii-full.csv", "rb"), delimiter='\t')])      
 firstRow=array[0]
 secondRow=array[1]

 # find out how many nodes
 strfirstRow=' '.join(map(str,firstRow))
 first=strfirstRow.split()
 print first[4]
 nodes=first[4]
 nodes=float(nodes)

 #count timesteps
 temp=(len(array)-3)/nodes           
 timesteps=int(temp)+1

 #split array into timesteps:
 # X Y Z h(t1) h(t2) h(tn)

 ts1=array[3:nodes+3]#13541
 #print ts1

 ts2=array[nodes+4:nodes*2+4]
 #print ts2


 .......
 read ts3 to last timestep to arrays with loop....

Maybe someone can help me, thanks!!!

Solution

My take on your problem is, instead of reading the whole file into an array and process the array, read it line by line, creating the arrays as the data is read.

I read the number of rows and columns per timestep as described in the file, then create a new array for each timestep read (adding it to a list), then populating it with the read data.

import numpy as np

timesteps = []
timestep_results = []

f = open("ascii-full.csv", "rb")

# First line is number of rows (not counting the initial #)
rows = int(f.readline().strip()[1:].split()[-1])
counter = 0

# Second line is number of columns
columns = len(f.readline().strip().split())

# Next lines
for line in f:
    if line.startswith("#"):
        # it's a header: add time to timestep list, begin new array
        timesteps.append( float(line.strip().split("=")[1]) )
        timestep_results.append( np.zeros((rows, columns)) )
        counter = 0
    else:
        # it's data: add to array in appropiate row
        timestep_results[-1][counter] = map(float, line.strip().split())
        counter += 1

f.close()

Hope it helps!