I have a huge file from which I need to extract specific rows and columns and save them to an output file. I have around 1000 such files, so I want to do the same for each of them and end up with 1000 new files containing just the data I want. I am a real beginner in Python and am having difficulty doing this.
I have tried reading the file and saving all its lines in a list, but I couldn't get any further.
Cycle 3 Down - 20_3.2_10_100_1
units of measure: atoms / barn-cm
time (years)
nuclide 1.000E-02 3.000E-02 1.000E-01 3.000E-01 1.000E+00 3.000E+00 1.000E+01
-------- --------- --------- --------- --------- --------- --------- ---------
ag109 9.917E-07 9.917E-07 9.917E-07 9.917E-07 9.917E-07 9.917E-07 9.917E-07
am241 1.301E-07 1.389E-07 1.695E-07 2.565E-07 5.540E-07 1.349E-06 3.577E-06
am243 8.760E-08 8.760E-08 8.760E-08 8.760E-08 8.759E-08 8.757E-08 8.752E-08
cs133 2.083E-05 2.101E-05 2.112E-05 2.112E-05 2.112E-05 2.112E-05 2.112E-05
eu151 4.979E-10 5.579E-10 7.679E-10 1.367E-09 3.458E-09 9.368E-09 2.935E-08
eu153 1.128E-06 1.132E-06 1.132E-06 1.132E-06 1.132E-06 1.132E-06 1.132E-06
gd155 4.398E-10 5.831E-10 1.081E-09 2.477E-09 7.048E-09 1.778E-08 3.786E-08
mo95 1.317E-05 1.351E-05 1.466E-05 1.716E-05 1.960E-05 1.979E-05 1.979E-05
nd143 1.563E-05 1.587E-05 1.626E-05 1.641E-05 1.641E-05 1.641E-05 1.641E-05
nd145 1.181E-05 1.181E-05 1.181E-05 1.181E-05 1.181E-05 1.181E-05 1.181E-05
np237 2.898E-06 2.944E-06 2.982E-06 2.985E-06 2.986E-06 2.989E-06 3.017E-06
This is the part of the file that I want to save. I want to save the nuclide name and the last column values.
nuclide = []
with open('filename.txt', 'r') as myfile:
    for line in myfile:
        nuclide.append(line)
print(nuclide[4900].find("ag109"))
I should end up with a list containing each nuclide symbol together with its last-column value.
If you want to read the data you show, extract just the data lines, parse them to extract the first and last columns, and then write just the first and last columns to a file, here's how you can do that:
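import re

with open("/tmp/input.txt") as ifh:
    with open("/tmp/output.txt", "w") as ofh:
        while True:
            line = ifh.readline()
            if not line:
                break  # an empty string means end of file
            # Split the line on runs of whitespace
            columns = re.split(r"\s+", line.strip())
            # Keep only data rows: eight columns, skipping the header and the dashed separator
            if len(columns) == 8 and columns[0] != 'nuclide' and columns[0][0] != '-':
                ofh.write("{} {}\n".format(columns[0], columns[7]))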
I tried to be very forgiving about exactly what appears at the beginning of the file. Here's the output I get from taking the data in your question, pasting it into a file as is, with all the stuff at the top, and then running this program on it.
/tmp/output.txt:
ag109 9.917E-07
am241 3.577E-06
am243 8.752E-08
cs133 2.112E-05
eu151 2.935E-08
eu153 1.132E-06
gd155 3.786E-08
mo95 1.979E-05
nd143 1.641E-05
nd145 1.181E-05
np237 3.017E-06
This should be able to handle very large files, as I don't read the whole file into memory, but rather read and write lines one by one.
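Since the question mentions around 1000 files, here is a minimal sketch of how the same filter could be applied to every file in a directory. The function names `extract_last_column` and `process_directory`, the `*.txt` pattern, and the idea of a separate output directory are my assumptions, not something given in the question. It uses `str.split()` rather than `re.split`, which splits on whitespace in the same way for these lines.

```python
from pathlib import Path


def extract_last_column(lines):
    """Yield (nuclide, last_value) pairs from an iterable of text lines."""
    for line in lines:
        columns = line.split()
        # Same filter as the single-file program: eight columns,
        # skipping the header row and the dashed separator row.
        if len(columns) == 8 and columns[0] != "nuclide" and not columns[0].startswith("-"):
            yield columns[0], columns[-1]


def process_directory(in_dir, out_dir, pattern="*.txt"):
    """Write one reduced file per matching input file; return how many were processed.

    Assumption: out_dir is a different directory from in_dir, so no
    input file is overwritten while it is being read.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for in_path in sorted(in_dir.glob(pattern)):
        out_path = out_dir / in_path.name  # same file name, different directory
        with in_path.open() as ifh, out_path.open("w") as ofh:
            # Lines are still read and written one at a time.
            for nuclide, value in extract_last_column(ifh):
                ofh.write(f"{nuclide} {value}\n")
        count += 1
    return count
```

You would call it as something like `process_directory("/tmp/inputs", "/tmp/outputs")`, where both paths are placeholders for your own directories. If you want the list described in the question instead of a file, `list(extract_last_column(open("filename.txt")))` should give you the (nuclide, value) pairs for one file.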