
Convert blocks into arrays


I have a text file containing data in the form of blocks, something like:

File description
Used units
Additional info
T[K]
50 75 100
125 150 175
200 225 250
Field_1
0.1 0.2 0.3
0.4 0.5 0.6
0.7 0.8 0.9
Field_2
1.0 2.0 3.0
4.0 5.0 6.0
7.0 8.0 9.0

I need to skip the lines without data, and read and convert the three blocks with data into three arrays. Ideally, I want to use a generator that can identify the lines with T[K], Field_1, Field_2, and separately collect whatever is in the following block of three lines.

Something that starts like this:

def npgenfromtxtgenerator(file_name):
    with open(file_name) as fp:
        for line_no, line in enumerate(fp):
            if line.startswith('T[K]'):
                pass  # Make first array
            elif line.startswith('Field_1'):
                pass  # Make second array
            elif line.startswith('Field_2'):
                pass  # Make third array

Many thanks


Solution

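  • One option is a regular expression that captures each header together with the numeric block that follows it, then parses each block with np.loadtxt: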

    s = """\
    File description
    Used units
    Additional info
    T[K]
    50 75 100
    125 150 175
    200 225 250
    Field_1
    0.1 0.2 0.3
    0.4 0.5 0.6
    0.7 0.8 0.9
    Field_2
    1.0 2.0 3.0
    4.0 5.0 6.0
    7.0 8.0 9.0"""
    
    import re
    import numpy as np
    from io import StringIO
    
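    # Block headers to look for; re.escape keeps "[" and "]" literal.
    fields = ["T[K]", "Field_1", "Field_2"]
    pat = "|".join(map(re.escape, fields))
    # Group 1 is the header, group 2 is the run of digits, dots, minus signs
    # and whitespace that follows it (i.e. the numeric block).
    pat = re.compile(fr"^({pat})([\s\d.-]+)", flags=re.M | re.S)
    
    # np.loadtxt parses each captured block into a 2-D float array.
    out = {n: np.loadtxt(StringIO(a)) for n, a in pat.findall(s)}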
    
    # pretty print the dictionary:
    for k, v in out.items():
        print(k)
        print(v)
        print("-" * 80)
    

    Prints:

    T[K]
    [[ 50.  75. 100.]
     [125. 150. 175.]
     [200. 225. 250.]]
    --------------------------------------------------------------------------------
    Field_1
    [[0.1 0.2 0.3]
     [0.4 0.5 0.6]
     [0.7 0.8 0.9]]
    --------------------------------------------------------------------------------
    Field_2
    [[1. 2. 3.]
     [4. 5. 6.]
     [7. 8. 9.]]
    --------------------------------------------------------------------------------
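
  • Since the question asks for a generator, here is a minimal sketch of that approach as an alternative to the regex. The function name iter_blocks and its headers parameter are made up for illustration. It assumes each header sits alone on its line and that every non-blank line after a header, up to the next header, contains only numbers. Unlike the character class [\s\d.-] above, float() also accepts exponent notation such as 1e-3.

```python
import numpy as np


def iter_blocks(lines, headers=("T[K]", "Field_1", "Field_2")):
    """Yield (header, 2-D float array) for each block in an iterable of lines.

    Hypothetical helper: assumes a header is alone on its line and is
    followed only by numeric rows until the next header or end of input.
    """
    name, rows = None, []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        if text in headers:
            # A new header closes the previous block, if there was one.
            if name is not None and rows:
                yield name, np.array(rows)
            name, rows = text, []
        elif name is not None:
            # Data row belonging to the current block.
            rows.append([float(x) for x in text.split()])
        # Lines before the first header (file description etc.) are skipped.
    if name is not None and rows:
        yield name, np.array(rows)


sample = """\
File description
Used units
Additional info
T[K]
50 75 100
125 150 175
200 225 250
Field_1
0.1 0.2 0.3
0.4 0.5 0.6
0.7 0.8 0.9
Field_2
1.0 2.0 3.0
4.0 5.0 6.0
7.0 8.0 9.0"""

blocks = dict(iter_blocks(sample.splitlines()))
for header, array in blocks.items():
    print(header)
    print(array)
```

    To read from a file instead, pass the open file object, which is itself an iterable of lines: with open(file_name) as fp: blocks = dict(iter_blocks(fp)). The file is then processed line by line rather than being loaded into one string first.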