I have output from a Quantum Chemistry program from which I wish to extract tabular data for input into a Python port of a FORTRAN program I wrote about 25 years ago.
Some of the output files are rather long, as many as 6000 lines, which precludes the use of a spreadsheet for processing.
A typical table is of the form:
CARTESIAN COORDINATES
1 C 0.011987266 -0.003842185 0.006578784
2 H 1.097152909 -0.003956163 0.013339310
3 H -0.349612312 1.019316731 0.001903075
4 H -0.344276148 -0.517463019 -0.880495291
5 H -0.355315644 -0.513266496 0.891567896
I'm not asking for someone to write the Python code for me, but rather to give me some guidance through the labyrinth of available code.
I suggest you look into np.genfromtxt.
The following code snippet will read the example data from your question stored in a file called data.txt.
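import numpy as np
# skip_header=2 assumes two lines precede the first data row; adjust to match your file
dtype = [('id', 'i8'), ('label', 'S1'), ('x', 'f8'), ('y', 'f8'), ('z', 'f8')]
data = np.genfromtxt('data.txt', skip_header=2, dtype=dtype)
print(data)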
Output
[(1, b'C', 0.01198727, -0.00384219, 0.00657878)
(2, b'H', 1.09715291, -0.00395616, 0.01333931)
(3, b'H', -0.34961231, 1.01931673, 0.00190307)
(4, b'H', -0.34427615, -0.51746302, -0.88049529)
(5, b'H', -0.35531564, -0.5132665 , 0.8915679 )]
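Since your real files run to thousands of lines with the table buried somewhere in the middle, one possible approach is to locate the block first and hand only those lines to np.genfromtxt, which also accepts a list of strings. The following is a minimal sketch, not a definitive implementation. The helper name extract_table is hypothetical, and it assumes the table starts after a line containing "CARTESIAN COORDINATES" and that every data row has exactly five fields beginning with an integer index. It also swaps 'S1' for 'U2' so that labels come back as ordinary strings and two-letter element symbols are not truncated.

```python
import numpy as np


def extract_table(lines, header="CARTESIAN COORDINATES"):
    """Return the data rows following the first line that contains `header`.

    Hypothetical helper: assumes each data row is
    <integer index> <element symbol> <x> <y> <z>.
    """
    rows = []
    in_table = False
    for line in lines:
        if not in_table:
            # Keep scanning until the table title is found.
            in_table = header in line
            continue
        fields = line.split()
        if len(fields) == 5 and fields[0].isdigit():
            rows.append(line)
        elif rows:
            # First non-matching line after some data rows ends the table.
            break
    return rows


# Stand-in for a long output file; in practice use open(filename) instead.
sample = """\
some earlier program output
CARTESIAN COORDINATES

1 C 0.011987266 -0.003842185 0.006578784
2 H 1.097152909 -0.003956163 0.013339310
3 H -0.349612312 1.019316731 0.001903075
4 H -0.344276148 -0.517463019 -0.880495291
5 H -0.355315644 -0.513266496 0.891567896

some later program output
"""

rows = extract_table(sample.splitlines())

# 'U2' is an assumption: a Unicode label of up to two characters.
dtype = [('id', 'i8'), ('label', 'U2'), ('x', 'f8'), ('y', 'f8'), ('z', 'f8')]
data = np.genfromtxt(rows, dtype=dtype)

# Plain (n_atoms, 3) float array, convenient for numerical code.
coords = np.column_stack([data['x'], data['y'], data['z']])
```

For a file on disk, passing the open file object works the same way, because the helper only iterates over lines: with open('output.log') as f: rows = extract_table(f). The file name here is only a placeholder.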