Read data from a text file in an arbitrary directory in Python


I want to read numeric data from a text file whose directory is not fixed. The file contains both words and numbers and looks like this. How can I extract the two columns?

Start Time:  7/28/2019 7:58:06 PM         Time Completed:  7/28/2019 8:21:24 PM     Elapsed Time:  00:23:17
Sample ID:     190728-MTJ-IP

***DATA***

    Field(Oe)    Moment(emu)    
     987.95878   0.000046470297     
     963.27719   0.000046452876     
     938.57541   0.000046659299     
     913.89473   0.000046416303     
     889.19093   0.000046813005     
     864.50576   0.000047033128     
     839.80973   0.000046368291     
     815.12703   0.000046888714     
     790.45031   0.000045933749     
     765.75385   0.00004716459  
     741.05444   0.000046405491 

I intend to use the function below, but I am not sure which indexes to use:

def txtread(filepath):
    data = []
    with open(filepath+'.txt', 'r') as readfile:
        datalines = readfile.readlines()
        for lines in datalines:
            temp = lines.strip('\t\n').split(',')
            temp = np.array(temp[:],dtype=float)
    data = np.array(data[0::2])
    H = data[:,0]
    M = data[:,1]

Solution

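  • The pandas read_csv function has parameters that handle all of this: skiprows=5 skips the five lines above the column header, and delim_whitespace=True splits columns on runs of whitespace (pandas 2.2 and later deprecate delim_whitespace in favor of sep=r'\s+'):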

    >>> import pandas as pd
    >>> pd.read_csv('temp.txt', skiprows=5, delim_whitespace=True)                                                    
    
        Field(Oe)  Moment(emu)
    0   987.95878     0.000046
    1   963.27719     0.000046
    2   938.57541     0.000047
    3   913.89473     0.000046
    4   889.19093     0.000047
    5   864.50576     0.000047
    6   839.80973     0.000046
    7   815.12703     0.000047
    8   790.45031     0.000046
    9   765.75385     0.000047
    10  741.05444     0.000046
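
The table above rounds Moment(emu) for display only. The following is a minimal sketch, not part of the original answer, of selecting the two columns by name and checking that the stored values keep their full precision. It inlines three rows of the sample through io.StringIO so that it runs on its own (an assumption for illustration; a real file path works the same way), and it uses sep=r"\s+" in place of delim_whitespace=True.

```python
import io

import pandas as pd

# Assumption: a shortened copy of the sample file, inlined so the sketch
# is self-contained. With a real file, pass its path to read_csv instead.
sample = """\
Start Time:  7/28/2019 7:58:06 PM         Time Completed:  7/28/2019 8:21:24 PM     Elapsed Time:  00:23:17
Sample ID:     190728-MTJ-IP

***DATA***

    Field(Oe)    Moment(emu)
     987.95878   0.000046470297
     963.27719   0.000046452876
     938.57541   0.000046659299
"""

# Skip the five lines above the column header; split on runs of whitespace.
df = pd.read_csv(io.StringIO(sample), skiprows=5, sep=r"\s+")

# Select columns by header name instead of by numeric index.
H = df["Field(Oe)"]
M = df["Moment(emu)"]

# Print one value with more digits than the default DataFrame display shows.
print(f"{M.iloc[0]:.12f}")
```

Selecting by column name avoids having to work out positional indexes at all.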
    

    The output of pd.read_csv is a DataFrame. If you prefer to work with NumPy arrays, convert it with .values (or the equivalent to_numpy() method):

    df = pd.read_csv(...)
    np_data = df.values
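
For comparison, here is a hedged sketch of how the txtread function from the question could be made to work with NumPy alone. It is one possible fix, not the answerer's method. It assumes that the numeric block starts right after the header line whose first token is "Field(Oe)", that columns are separated by whitespace rather than commas, and that the caller passes the full path including the extension (the original appended '.txt' itself).

```python
import numpy as np


def txtread(filepath):
    """Return (H, M) arrays from a file laid out like the sample in the question."""
    rows = []
    in_data = False
    with open(filepath, "r") as readfile:
        for line in readfile:
            fields = line.split()  # split on any run of whitespace
            if not in_data:
                # Assumption: data begins after the column-header line,
                # identified by its first token "Field(Oe)".
                in_data = bool(fields) and fields[0] == "Field(Oe)"
                continue
            if len(fields) == 2:  # ignore blank or malformed lines
                rows.append([float(fields[0]), float(fields[1])])
    # reshape keeps the array two-dimensional even if no rows were found
    data = np.array(rows, dtype=float).reshape(-1, 2)
    H = data[:, 0]  # first column: Field(Oe)
    M = data[:, 1]  # second column: Moment(emu)
    return H, M
```

Because the directory varies, the path can be built by the caller (for example with os.path.join or pathlib.Path) and passed in whole. Once each line has been split into two numbers, the indexes are simply 0 for the field column and 1 for the moment column.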