I want to read numeric data from a text file that can be located in any directory. The file contains both text and numbers, as shown below. How can I extract the two columns?
Start Time: 7/28/2019 7:58:06 PM
Time Completed: 7/28/2019 8:21:24 PM
Elapsed Time: 00:23:17
Sample ID: 190728-MTJ-IP
***DATA***
Field(Oe) Moment(emu)
987.95878 0.000046470297
963.27719 0.000046452876
938.57541 0.000046659299
913.89473 0.000046416303
889.19093 0.000046813005
864.50576 0.000047033128
839.80973 0.000046368291
815.12703 0.000046888714
790.45031 0.000045933749
765.75385 0.00004716459
741.05444 0.000046405491
I intend to use the function below, but I am confused about which indexes I should use:
def txtread(filepath):
    data = []
    with open(filepath+'.txt', 'r') as readfile:
        datalines = readfile.readlines()
    for lines in datalines:
        temp = lines.strip('\t\n').split(',')
        temp = np.array(temp[:],dtype=float)
    data = np.array(data[0::2])
    H = data[:,0]
    M = data[:,1]
The pandas read_csv function has parameters that handle all of this (skipping the preamble lines and splitting on whitespace):
>>> import pandas as pd
>>> pd.read_csv('temp.txt', skiprows=5, delim_whitespace=True)
Field(Oe) Moment(emu)
0 987.95878 0.000046
1 963.27719 0.000046
2 938.57541 0.000047
3 913.89473 0.000046
4 889.19093 0.000047
5 864.50576 0.000047
6 839.80973 0.000046
7 815.12703 0.000047
8 790.45031 0.000046
9 765.75385 0.000047
10 741.05444 0.000046
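Two details about the call above are worth a quick sketch. First, the six-decimal values in the printed table are only display rounding; the stored floats keep their full precision. Second, recent pandas releases (2.2 and later) deprecate delim_whitespace in favour of sep=r"\s+", which has the same effect. The example below is a minimal sketch rather than a definitive recipe: it reads a shortened copy of the sample from an in-memory string instead of a file, and it assumes the preamble occupies five lines before the column header (which is what skiprows=5 implies).

```python
import io

import pandas as pd

# Shortened copy of the sample file from the question.
# Assumption: the preamble spans five lines before the column header.
sample = """Start Time: 7/28/2019 7:58:06 PM
Time Completed: 7/28/2019 8:21:24 PM
Elapsed Time: 00:23:17
Sample ID: 190728-MTJ-IP
***DATA***
Field(Oe) Moment(emu)
987.95878 0.000046470297
963.27719 0.000046452876
938.57541 0.000046659299
"""

# sep=r"\s+" splits on any run of whitespace, like delim_whitespace=True.
df = pd.read_csv(io.StringIO(sample), skiprows=5, sep=r"\s+")

# Select the columns by the header names found in the file.
H = df["Field(Oe)"]
M = df["Moment(emu)"]

# Printing a single value shows more digits than the rounded table view.
print(M.iloc[0])
```

To read a real file, pass its path in place of the io.StringIO object.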
The output of pd.read_csv is a DataFrame. If you prefer to work with NumPy arrays, convert it with .values:
df = pd.read_csv(...)
np_data = df.values
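If NumPy arrays are the end goal, the file can also be read without pandas. The following is a minimal sketch of the questioner's txtread rewritten around np.loadtxt, not the only way to do it. It assumes the columns are separated by whitespace (not commas) and that six lines precede the numbers: five preamble lines plus the column header. The n_header_lines parameter is a hypothetical name introduced here for illustration, and the demo reads from an in-memory string rather than a file on disk.

```python
import io

import numpy as np


def txtread(source, n_header_lines=6):
    """Return (H, M) arrays from a path or file-like object.

    n_header_lines is an illustrative parameter: the number of
    non-numeric lines to skip before the data starts.
    """
    # np.loadtxt splits on whitespace by default and returns a 2-D float array.
    data = np.loadtxt(source, skiprows=n_header_lines)
    # Column 0 is the field, column 1 is the moment.
    return data[:, 0], data[:, 1]


# Shortened copy of the sample file, assuming a five-line preamble.
sample = """Start Time: 7/28/2019 7:58:06 PM
Time Completed: 7/28/2019 8:21:24 PM
Elapsed Time: 00:23:17
Sample ID: 190728-MTJ-IP
***DATA***
Field(Oe) Moment(emu)
987.95878 0.000046470297
963.27719 0.000046452876
938.57541 0.000046659299
"""

H, M = txtread(io.StringIO(sample))
print(H.shape, M.shape)
```

This also answers the indexing question: once the data is a 2-D array, data[:, 0] selects every row of the first column and data[:, 1] every row of the second, so no slicing such as [0::2] is needed.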