Python-Regular expressions: extract a list of tuples after a keyword from a text file


I want to implement a simplified version of what I have suggested here to import some vertices from an OpenFOAM blockMeshDict file and then visualize them with FreeCAD.

The part of the file I'm interested in is a list of float tuples (xi yi zi), enclosed in parentheses after the vertices keyword. The file looks like this:

vertices
(
    (1 2 3)
    (3 4 5)
    ...
)

I'm able to read the file from the same folder as the Python script with:

import os
os.chdir(os.path.dirname(__file__))
with open("blockMeshDict", "r") as f:
    s=f.read()

But when I try to extract the content between the parentheses after the vertices keyword with:

import re
r1=re.search(r'vertices\n\((.*?)\)', s)
print r1.group(1)

I get the error:

<type 'exceptions.IndexError'>: no such group

and I don't know how to solve it. What I want in the end is a list of tuples like [(x1,y1,z1),(x2,y2,z2)...]. I would appreciate any help with implementing this in Python 2.7.

P.S. A summary of this effort can be found in this GitHub Gist


Solution

  • The main regex, which finds the outer structure, is: \bvertices\s*\((\s*(?:\([^)]+\)\s*)+)\)

    Before applying it, remove all comments so that commented-out vertices are not matched.

    A second regex then extracts each tuple inside the vertices structure: \([^)]+\)

    See demo here.

    The code:

    import re
    
    # Raw string, so the backslashes in the OpenFOAM banner are kept literally.
    test_str = r"""
    /*--------------------------------*- C++ -*----------------------------------*\
    | =========                 |                                                 |
    | \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
    |  \\    /   O peration     | Version:  5                                     |
    |   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
    |    \\/     M anipulation  |                                                 |
    \*---------------------------------------------------------------------------*/
    FoamFile
    {
        version     2.0;
        format      ascii;
        class       dictionary;
        object      blockMeshDict;
    }
    // * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
    
    convertToMeters 0.001;
    
    vertices
    (
        (-20.6 0 -0.5)
        (-20.6 25.4 -0.5)  /* Some comment */
        (0 -25.4 -0.5)
        (0 0 -0.5)
        (0 25.4 -0.5)
        (206 -25.4 -0.5)
        (206 0 -0.5)
        (206 25.4 -0.5)
        (290 -16.6 -0.5)
        (290 0 -0.5)
        (290 16.6 -0.5)
    
        (-20.6 0 0.5)
        (-20.6 25.4 0.5)
        (0 -25.4 0.5)
        (0 0 0.5)
        (0 25.4 0.5)
        (206 -25.4 0.5)
        (206 0 0.5)
        (206 25.4 0.5)
        (290 -16.6 0.5)
        (290 0 0.5)
        (290 16.6 0.5)
      /*(1 2 3 4)*/ // Commented tuple
      //(1 2 3 4)
    );
    
    /* vertices commented
    vertices
    (
        (-20.6 0 -0.5)
        (-20.6 25.4 -0.5)
        (0 -25.4 -0.5)
        (0 0 -0.5)
        (0 25.4 -0.5)
        (206 -25.4 -0.5)
        (206 0 -0.5)
        (206 25.4 -0.5)
        (290 -16.6 -0.5)
        (290 0 -0.5)
        (290 16.6 -0.5)
    )
    */
    
    negY
    (
        (2 4 1)
        (1 3 0.3)
    );
    
    posY
    (
        (1 4 2)
        (2 3 4)
        (2 4 0.25)
    );
    
    posYR
    (
        (2 1 1)
        (1 1 0.25)
    );
    
    
    blocks
    (
        hex (0 3 4 1 11 14 15 12)
        (18 30 1)
        simpleGrading (0.5 $posY 1)
    
        hex (2 5 6 3 13 16 17 14)
        (180 27 1)
        edgeGrading (4 4 4 4 $negY 1 1 $negY 1 1 1 1)
    
        hex (3 6 7 4 14 17 18 15)
        (180 30 1)
        edgeGrading (4 4 4 4 $posY $posYR $posYR $posY 1 1 1 1)
    
        hex (5 8 9 6 16 19 20 17)
        (25 27 1)
        simpleGrading (2.5 1 1)
    
        hex (6 9 10 7 17 20 21 18)
        (25 30 1)
        simpleGrading (2.5 $posYR 1)
    );
    
    edges
    (
    );
    
    boundary
    (
        inlet
        {
            type patch;
            faces
            (
                (0 1 12 11)
            );
        }
        outlet
        {
            type patch;
            faces
            (
                (8 9 20 19)
                (9 10 21 20)
            );
        }
        upperWall
        {
            type wall;
            faces
            (
                (1 4 15 12)
                (4 7 18 15)
                (7 10 21 18)
            );
        }
        lowerWall
        {
            type wall;
            faces
            (
                (0 3 14 11)
                (3 2 13 14)
                (2 5 16 13)
                (5 8 19 16)
            );
        }
        frontAndBack
        {
            type empty;
            faces
            (
                (0 3 4 1)
                (2 5 6 3)
                (3 6 7 4)
                (5 8 9 6)
                (6 9 10 7)
                (11 14 15 12)
                (13 16 17 14)
                (14 17 18 15)
                (16 19 20 17)
                (17 20 21 18)
            );
        }
    );
    // ************************************************************************* //
    """
    
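    # Strip comments: line comments first, then block comments (which may span lines).
    test_str = re.sub(r"//.*", '', test_str)
    test_str = re.sub(r"/\*.*?\*/", '', test_str, flags=re.DOTALL)
    
    # Match the vertices block; group 1 holds everything between its outer parentheses.
    # No flags are needed: the pattern uses neither '.', '^' nor '$'.
    matches = re.findall(r"\bvertices\s*\((\s*(?:\([^)]+\)\s*)+)\)", test_str)
    
    # Extract each "(x y z)" tuple from the block as a string.
    matches2 = re.findall(r"\([^)]+\)", matches[0])
    print(matches2)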
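
The code above yields strings such as "(-20.6 0 -0.5)", while the question asks for a list of float tuples. A minimal sketch of that last step follows. The helper name to_float_tuples is made up for illustration, and it assumes every entry holds whitespace-separated numbers that float() can parse.

```python
def to_float_tuples(tuple_strings):
    """Convert strings such as '(1 2 3)' into tuples of floats.

    Hypothetical helper, not part of the original answer.
    """
    result = []
    for item in tuple_strings:
        # Drop the surrounding parentheses, then split on whitespace.
        numbers = item.strip().strip("()").split()
        result.append(tuple(float(n) for n in numbers))
    return result


print(to_float_tuples(["(-20.6 0 -0.5)", "(0 25.4 0.5)"]))
# prints [(-20.6, 0.0, -0.5), (0.0, 25.4, 0.5)]
```

It should run unchanged on Python 2.7 and Python 3, since print is called with a single argument.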
    

    Explained:

    \b        # word boundary
    vertices  # literal 'vertices'
    \s*       # 0 or more whitespace characters (including newlines)
    \(        # literal '('
      (       # first capturing group
        \s*   # optional whitespace
        (?:   # non-capturing group
           \([^)]+\)   # literal '(' + one or more non-')' characters + literal ')'
           \s*         # optional trailing whitespace
        )+    # repeated one or more times
      )
    \)        # literal ')'
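
The annotated breakdown above can also be kept in the code itself by compiling the pattern with re.VERBOSE, which ignores whitespace and # comments inside the pattern. This is only a sketch of that option: the name VERTICES_BLOCK and the short sample string are chosen here for illustration.

```python
import re

# Same pattern as in the answer, written out with re.VERBOSE.
VERTICES_BLOCK = re.compile(r"""
    \b            # word boundary
    vertices      # literal 'vertices'
    \s*           # optional whitespace, including newlines
    \(            # opening parenthesis of the block
    (             # group 1: everything inside the block
        \s*
        (?:
            \([^)]+\)   # one tuple: '(' + one or more non-')' chars + ')'
            \s*
        )+
    )
    \)            # closing parenthesis of the block
""", re.VERBOSE)

sample = "vertices\n(\n    (1 2 3)\n    (3 4 5)\n);\n"
match = VERTICES_BLOCK.search(sample)
tuples = re.findall(r"\([^)]+\)", match.group(1))
print(tuples)  # prints ['(1 2 3)', '(3 4 5)']
```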
    

    Then take that captured group and search it for \([^)]+\). Each match is one vertex tuple as a string, such as (-20.6 0 -0.5).
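
For comparison, the sketch below runs the pattern from the question on a string laid out like the excerpt shown there (an assumption, since the real file may differ). Without re.DOTALL, '.' does not match a newline, so the pattern does not match. With re.DOTALL it matches, but the lazy group ends at the first ')' and captures only part of the first tuple.

```python
import re

# Assumed layout, copied from the excerpt in the question.
sample = "vertices\n(\n    (1 2 3)\n    (3 4 5)\n)\n"
pattern = r'vertices\n\((.*?)\)'

# Without re.DOTALL, '.' cannot cross the newline after the opening '(',
# so re.search finds nothing and returns None.
no_flag = re.search(pattern, sample)
print(no_flag)

# With re.DOTALL the pattern matches, but the lazy group stops at the
# first ')', which closes the first tuple rather than the whole block.
with_flag = re.search(pattern, sample, re.DOTALL)
print(repr(with_flag.group(1)))
```

This is why the answer's pattern describes the inner tuples explicitly with \([^)]+\) instead of relying on '.'.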