python pandas dataframe eye-tracking

Iteratively read a (TSV) file into a pandas DataFrame


I have some experimental data that looks like this: http://paste2.org/YzJL4e1b (too long to post here). The blocks separated by field-name lines are different trials of the same experiment. I would like to read everything into a pandas DataFrame but group certain trials together (for instance 0, 1, 6, 7 in one group and 2, 3, 4, 5 in another). The trials were run under slightly different conditions, and I would like to analyze how the results differ between those conditions. I have a list of the trial numbers for each condition from another file.

Currently I am doing this:

tracker_data = pd.DataFrame
tracker_data = tracker_data.from_csv(bhpath+i+'_wmet.tsv', sep='\t', header=4)
tracker_data['GazePointXLeft'] = tracker_data['GazePointXLeft'].astype(np.float64) 

But this, of course, reads everything in one go, including the repeated field-name lines. It would be great if I could nest the blocks somehow so that I can easily access them by numeric index.

Do you have any ideas how I could best do this?


Solution

  • You should use read_csv rather than from_csv*:

    tracker_data = pd.read_csv(bhpath+i+'_wmet.tsv', sep='\t', header=4)
    

    If you want to join a list of DataFrames like this you could use concat:

    trackers = (pd.read_csv(bhpath+i+'_wmet.tsv', sep='\t', header=4) for i in range(?))
    df = pd.concat(trackers)
    

    * DataFrame.from_csv was deprecated in pandas 0.21 and removed in pandas 1.0.
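To get the nested, numerically indexed blocks asked for in the question, one option is to split the file on the repeated field-name lines yourself and let concat build a (trial, row) index. The following is only a rough sketch under assumptions I could not check against the real file: that the field-name line is repeated verbatim before every trial, and that any preamble can be skipped by line count. The helper `read_trials`, the `skip_lines` parameter, the `condition_of` mapping and the sample values (including the `GazePointYLeft` column) are made up for illustration.

```python
import io

import pandas as pd


def read_trials(text, sep="\t", skip_lines=0):
    """Split `text` on repeated header lines; return one DataFrame
    indexed by (trial, row). Hypothetical helper, not a pandas API."""
    # Drop the preamble and any blank lines.
    lines = [ln for ln in text.splitlines()[skip_lines:] if ln.strip()]
    header = lines[0]  # assumed to be repeated verbatim before every trial
    blocks = []
    for line in lines:
        if line == header:
            blocks.append([])        # a header line opens a new trial block
        else:
            blocks[-1].append(line)  # a data line belongs to the current trial
    # Parse every block separately, re-attaching the header each time.
    frames = [
        pd.read_csv(io.StringIO("\n".join([header] + rows)), sep=sep)
        for rows in blocks
    ]
    # keys= adds the trial number as the outer index level.
    return pd.concat(frames, keys=list(range(len(frames))), names=["trial", "row"])


# Made-up sample with three trials; the real file has more columns.
sample = (
    "GazePointXLeft\tGazePointYLeft\n"
    "1.0\t2.0\n"
    "3.0\t4.0\n"
    "GazePointXLeft\tGazePointYLeft\n"
    "5.0\t6.0\n"
    "GazePointXLeft\tGazePointYLeft\n"
    "7.0\t8.0\n"
)
trials = read_trials(sample)

# Hypothetical trial -> condition mapping (in practice, from the other file).
condition_of = {0: "A", 1: "B", 2: "A"}
trials["condition"] = [
    condition_of[t] for t in trials.index.get_level_values("trial")
]

first_trial = trials.loc[0]  # all rows of trial 0
per_condition = trials.groupby("condition")["GazePointXLeft"].mean()
```

With the real file you would pass its contents as `text`; the `header=4` in the question suggests `skip_lines=4`, but that is a guess. A single trial is then available as `trials.loc[n]`, and the `condition` column lets you compare groups with `groupby`.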