Tags: regex, cobol, regex-lookarounds, adobe-brackets

Regex that identifies sections in COBOL


I'm customising an outline plugin for Brackets that uses regex to identify the outline of the currently opened file.

Using regex101.com I've created the following regex (uses lookarounds to determine that the line starts with seven spaces and ends with ' SECTION.'):

(?<=^       )([A-Za-z\-0-9]*)(?= SECTION\.[ ]*$)

regex101.com accepts it, but JSHint/JSLint reports it as invalid, and it doesn't work when I test it, so I suspect JSHint/JSLint is correct.

The following is an example of some COBOL code from which I want to extract 2000-GET-EXPECTED-BY-DATE and 2020-GET-DUE-DATE.

          ...
      2000-GET-EXPECTED-BY-DATE SECTION.
          MOVE '2' TO W10-OPTION.

          ...

          ELSE                                                     
              MOVE 'Y' TO W10-NO-ERRORS                         
          END-IF.                                                  

      2017-EXIT.                                                   
          EXIT.                                                   
     /
      2020-GET-DUE-DATE SECTION.
      2020.

          MOVE 'N' TO W10-USER-INPUT-DUE-DATE-SW.
          MOVE '1' TO W10-OPTION.
          ...

So my questions are:

  • Is the regex valid?
  • If invalid, then what have I done wrong?
  • How should I write the regex to find the name of each section?

Solution

  • This works for me to find the lines with "SECTION":

    ^[ ]{7}(.*)[ ]SECTION\.$
    

    DEMO: http://regex101.com/r/zC1xY6/2

    If you only want the section names without the numeric prefix (for example, GET-DUE-DATE rather than 2020-GET-DUE-DATE): ^[ ]{7}\d+\-(.*)[ ]SECTION\.$
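
    On the first two questions: the likely reason JSHint/JSLint rejects the original pattern is the lookbehind (?<=...), which JavaScript regular expressions did not support before ES2018 (lookahead, (?=...), is fine). regex101.com may accept it when a flavor such as PCRE is selected. A capture group avoids the lookbehind entirely. The sketch below shows one way this could look in JavaScript. findSectionNames and sample are hypothetical names used for illustration, not part of Brackets or the outline plugin, and the code assumes section headers begin after exactly seven spaces. Unlike the second pattern above, it keeps the numeric prefix in the captured name, and it tolerates trailing spaces as the original pattern intended.

```javascript
// Hypothetical helper (not a Brackets API): returns the section names
// found in COBOL source text.
// Assumption: a section header starts after exactly seven spaces.
function findSectionNames(source) {
  // m: ^ and $ match at every line boundary; g: find all matches.
  // Group 1 captures the name, so no lookbehind is required.
  var sectionPattern = /^[ ]{7}([A-Za-z0-9-]+)[ ]SECTION\.[ ]*$/gm;
  var names = [];
  var match;
  while ((match = sectionPattern.exec(source)) !== null) {
    names.push(match[1]);
  }
  return names;
}

// Sample modeled on the COBOL in the question, with seven leading spaces
// before each paragraph or section name.
var sample = [
  "           ...",
  "       2000-GET-EXPECTED-BY-DATE SECTION.",
  "           MOVE '2' TO W10-OPTION.",
  "       2017-EXIT.   ",
  "           EXIT.",
  "      /",
  "       2020-GET-DUE-DATE SECTION.",
  "       2020.",
  ""
].join("\n");

console.log(findSectionNames(sample));
// logs the two names: 2000-GET-EXPECTED-BY-DATE and 2020-GET-DUE-DATE
```

    The m flag is what lets ^ and $ match at the start and end of each line rather than only at the start and end of the whole file. Without it, none of the patterns above will match inside a multi-line source.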