I have a text file with a data set of the form
Line 1
Line 2
!
1.01499999 0.504999995 6.19969398E-7 5.38933136E-7 1.35450875E-6
1.74000001 0.220000029 7.92876381E-6 4.1831604E-6 6.61433387E-6
2.10750008 0.147500038 2.06282803E-5 9.86384475E-6 9.99511121E-6
2.54500008 0.289999962 1.9321451E-5 9.41255712E-6 1.40418542E-5
3.19000006 0.355000019 2.1970769E-5 1.08561271E-5 1.98473426E-5
4.18249989 0.637500048 1.94816221E-5 1.02610993E-5 2.58007622E-5
6.23999977 1.41999984 1.95048287E-5 1.50751257E-5 2.59193912E-5
8.50250053 0.84250021 -3.13448794E-7 7.45566576E-5 1.58940475E-5
-------
1.02499998 0.389999986 1.25625479E-6 8.79037373E-7 1.47827166E-6
1.57249999 0.157500029 3.83437873E-6 4.57433907E-6 4.91447827E-6
2.16750002 0.4375 9.80125606E-6 4.12876625E-6 9.34428499E-6
3.00500011 0.399999976 2.13497806E-5 9.10624203E-6 1.67928665E-5
4.25500011 0.850000024 1.35475839E-5 8.33513332E-6 2.37308996E-5
5.84749985 0.742500067 4.97072215E-5 2.55404848E-5 2.50197209E-5
8.25749969 1.66750002 1.56722888E-6 4.85851124E-5 1.65847723E-5
-------
0.741162002 0.158674002 1.99696819E-6 8.0933961E-7 8.04328749E-7
1.11103797 0.211201996 2.83219379E-6 1.18746482E-6 2.33293395E-6
1.48358989 0.161349952 6.8232016E-6 2.48437755E-6 4.83310259E-6
2.01257992 0.367640018 4.58416616E-6 2.24343057E-6 9.16807585E-6
2.67011499 0.289895058 2.16971075E-5 7.58860506E-6 1.58778421E-5
3.64500499 0.684994936 1.36258495E-5 5.31197475E-6 2.35662919E-5
5.09749985 0.767499924 1.92129683E-5 9.40257451E-6 2.89442723E-5
6.32749987 0.462500095 7.3068717E-5 2.60800098E-5 2.67253181E-5
8.11250019 1.32250023 3.10791838E-5 1.88406557E-5 1.89492275E-5
-------
I want to read it as a pandas dataframe to assign each data set separated by the string -------
so that I can read it in a fashion similar to this:
names = ['column_name1','column_name2','column_name3','column_name4','column_name5']
for j in range(5):
names.append('cols%i' %j)
kwargs = {
"names":names,
"delimiter":' ',
"skip_blank_lines":True
}
skiprows=3
with open('file.txt') as f:
content = "".join(f.readlines()[skiprows:])
dfs = {
k: pd.read_table(io.StringIO(data), **kwargs)
for k, data in enumerate(content.split(u'-------')) #dfs.values()[i] will be a separated dataset discriminated by the string `-------`
}
axs1.errorbar(dfs.values()[0].column_name1, dfs.values()[0].column_name2, xerr=dfs.values()[0].column_name3, yerr=dfs.values()[0].column_name4, fmt='.')
The above code returns the error TypeError: 'dict_values' object is not subscriptable
.
How can I read the data frame dfs
splitting it based on the string -------
in order that I can plot its values in a way similar to that used in the code with ax1.errorbar()
, e.g., using dfs.values()[0].column_name1
?
After assigning content
, you can do something like this:
dfs = {}
for i, g in enumerate(content.split("-------")):
df = pd.read_csv(io.StringIO(g), **kwargs)
if not df.empty:
dfs[i] = df
Then this:
for i, df in dfs.items():
print(f"{i = }")
print(df)
print()
will print:
i = 0
column_name1 column_name2 column_name3 column_name4 column_name5 cols0 cols1 cols2 cols3 cols4
0 1.015000 0.5050 6.199694e-07 5.389331e-07 0.000001 NaN NaN NaN NaN NaN
1 1.740000 0.2200 7.928764e-06 4.183160e-06 0.000007 NaN NaN NaN NaN NaN
2 2.107500 0.1475 2.062828e-05 9.863845e-06 0.000010 NaN NaN NaN NaN NaN
3 2.545000 0.2900 1.932145e-05 9.412557e-06 0.000014 NaN NaN NaN NaN NaN
4 3.190000 0.3550 2.197077e-05 1.085613e-05 0.000020 NaN NaN NaN NaN NaN
5 4.182500 0.6375 1.948162e-05 1.026110e-05 0.000026 NaN NaN NaN NaN NaN
6 6.240000 1.4200 1.950483e-05 1.507513e-05 0.000026 NaN NaN NaN NaN NaN
7 8.502501 0.8425 -3.134488e-07 7.455666e-05 0.000016 NaN NaN NaN NaN NaN
i = 1
column_name1 column_name2 column_name3 column_name4 column_name5 cols0 cols1 cols2 cols3 cols4
0 1.0250 0.3900 0.000001 8.790374e-07 0.000001 NaN NaN NaN NaN NaN
1 1.5725 0.1575 0.000004 4.574339e-06 0.000005 NaN NaN NaN NaN NaN
2 2.1675 0.4375 0.000010 4.128766e-06 0.000009 NaN NaN NaN NaN NaN
3 3.0050 0.4000 0.000021 9.106242e-06 0.000017 NaN NaN NaN NaN NaN
4 4.2550 0.8500 0.000014 8.335133e-06 0.000024 NaN NaN NaN NaN NaN
5 5.8475 0.7425 0.000050 2.554048e-05 0.000025 NaN NaN NaN NaN NaN
6 8.2575 1.6675 0.000002 4.858511e-05 0.000017 NaN NaN NaN NaN NaN
i = 2
column_name1 column_name2 column_name3 column_name4 column_name5 cols0 cols1 cols2 cols3 cols4
0 0.741162 0.158674 0.000002 8.093396e-07 8.043287e-07 NaN NaN NaN NaN NaN
1 1.111038 0.211202 0.000003 1.187465e-06 2.332934e-06 NaN NaN NaN NaN NaN
2 1.483590 0.161350 0.000007 2.484378e-06 4.833103e-06 NaN NaN NaN NaN NaN
3 2.012580 0.367640 0.000005 2.243431e-06 9.168076e-06 NaN NaN NaN NaN NaN
4 2.670115 0.289895 0.000022 7.588605e-06 1.587784e-05 NaN NaN NaN NaN NaN
5 3.645005 0.684995 0.000014 5.311975e-06 2.356629e-05 NaN NaN NaN NaN NaN
6 5.097500 0.767500 0.000019 9.402575e-06 2.894427e-05 NaN NaN NaN NaN NaN
7 6.327500 0.462500 0.000073 2.608001e-05 2.672532e-05 NaN NaN NaN NaN NaN
8 8.112500 1.322500 0.000031 1.884066e-05 1.894923e-05 NaN NaN NaN NaN NaN