regex, capturing-group

Regex to get a comma-separated list between the word uses and a semicolon in a file


As the title says, I need to find all imports in a Delphi file. The text looks a bit like this:

uses X.Y.Z, A.B, C.D.F;  class procedure 

So my regex matches would be:

  • X.Y.Z
  • A.B
  • C.D.F

I know I need to use capture groups for X.Y.Z and the other items, but I can only manage to get the first one. Between the items there can be spaces, one or more newlines, or both. Here is what I have so far: ^uses(?:[\n|\s]*([a-z|.|A-Z]+)(?:,)+)
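To make the symptom concrete, here is a minimal check of that attempt. It assumes Python's re module (the question does not name a regex flavor) and uses the sample line from above; the names ATTEMPT and text are only for illustration. The group around the item is not repeated, so the match stops after the first comma.

```python
import re

# The pattern from the question, unchanged. re.MULTILINE lets ^ match
# at the start of any line in a larger file.
ATTEMPT = re.compile(r"^uses(?:[\n|\s]*([a-z|.|A-Z]+)(?:,)+)", re.MULTILINE)

text = "uses X.Y.Z, A.B, C.D.F;  class procedure"

match = ATTEMPT.search(text)
print(match.group(0))  # uses X.Y.Z,
print(match.group(1))  # X.Y.Z
```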


Solution

  • I think you should take a 2-step approach:

    First, search the whole source file, capturing the text between:

    • uses (at the start of a line) followed by a sequence of whitespace,
    • and the ; that terminates the import list.

    Then, within each of the above matches, find the imported items.

    The first step can be done with ^uses\s+([a-z.,\s]+); where the text to process in the next step is the content of capturing group 1.

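    In the second step, performed on each of the above matches, you can use [a-z]+(?:\.[a-z]+)*(?=\s*(?:[,;]|$)). The lookahead has to accept the end of the text as well as , or ; because capturing group 1 does not include the terminating ;, so nothing follows the last item.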

    Both regexes should be run with the i (case-insensitive), m (multi-line) and g (global) options.

    Note that a single-regex approach does not work in most regex engines: when a capturing group inside a repetition matches several times, the engine keeps only the last match for that group.
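The two steps above can be sketched in Python as follows. This is only an illustration under a few assumptions: Python's re module stands in for whatever engine is actually used, finditer and findall play the role of the g option, and the names USES_BLOCK, UNIT_NAME and find_imports are made up for this sketch. The lookahead in the second pattern also accepts the end of the text, because group 1 does not contain the closing ;. The last lines use a hypothetical single-regex pattern to show the "last match only" behaviour.

```python
import re

FLAGS = re.IGNORECASE | re.MULTILINE  # the i and m options

# Step 1: everything between "uses" at the start of a line and the closing ";".
USES_BLOCK = re.compile(r"^uses\s+([a-z.,\s]+);", FLAGS)

# Step 2: each dotted name inside one captured block.
UNIT_NAME = re.compile(r"[a-z]+(?:\.[a-z]+)*(?=\s*(?:[,;]|$))", FLAGS)


def find_imports(source: str) -> list[str]:
    """Return every unit name listed in the uses clauses of source."""
    units = []
    for block in USES_BLOCK.finditer(source):
        # group(1) is the comma-separated list without "uses" and ";".
        units.extend(UNIT_NAME.findall(block.group(1)))
    return units


sample = "uses X.Y.Z,\n  A.B, C.D.F;  class procedure"
print(find_imports(sample))  # ['X.Y.Z', 'A.B', 'C.D.F']

# A repeated capturing group keeps only its last iteration.
single = re.match(r"uses\s+(?:([a-z.]+),?\s*)+;", "uses X.Y.Z, A.B, C.D.F;", FLAGS)
print(single.group(1))  # C.D.F
```

Like the patterns in the answer, this sketch only allows letters and dots in unit names; names containing digits or underscores would need a wider character class.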