regex perl

Capturing multiple instances of a pattern


I have a string:

{value1}+{value2}-{value3}*{value...n}

Using a regular expression, I want to capture each of the bracketed values as well as the operators between them. I do not know in advance how many bracketed values there will be.

I tried:

/(\{.*\}).*([\+|\-|\*|\/])*/mgU

but that is just getting me the values and not the operators. Where did I go wrong?


Solution

  • You can validate the string first with

    /\A ({ [^{}]* }) (?: [\/+*-] (?1))* \z/x
    

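    Details (the x flag makes the regex ignore literal whitespace, so the spaces in the pattern are only there for readability):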

    • \A - start of string
    • ({[^{}]*}) - Group 1: a {, then zero or more chars other than { and }, then a }
    • (?:[\/+*-](?1))* - zero or more occurrences of a /, +, * or - char followed by the Group 1 pattern ((?1) re-runs that subpattern, so it does not have to be written out again)
    • \z - end of string.
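
    As a rough illustration of the validation step, the sketch below wraps the same pattern in a helper and tries it on a few strings. The helper name is_valid_chain and the sample inputs are made up for this example, and the braces are escaped (\{ and \}) as a precaution, which should not change what the pattern matches.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical helper (not part of the original answer): returns 1 when
    # the whole string is a chain of {value} items joined by single operators.
    sub is_valid_chain {
        my ($s) = @_;
        return $s =~ /\A (\{ [^{}]* \}) (?: [\/+*-] (?1) )* \z/x ? 1 : 0;
    }

    # Made-up sample inputs for illustration.
    for my $s ('{a}+{b}', '{a}', '{a}+', '{a}{b}', 'a+{b}') {
        printf "%-8s %s\n", $s, is_valid_chain($s) ? 'valid' : 'invalid';
    }
    ```

    Only the first two samples should be reported as valid: a trailing operator, two values with no operator between them, and a value without braces all fail because the pattern is anchored at both ends.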

    Then, you may collect individual matches with

    / { [^{}]* } | [\/+*-] /gx
    

    This regex matches all occurrences of any substrings between { and } (with {[^{}]*}) or /, +, * or - chars (with [\/+*-]).
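
    If you would rather have the tokens in an array than print them one by one, something like the following may work. It is only a sketch: the variable names are illustrative, the braces are escaped as a precaution, and it skips the validation step, so it assumes the string has already been checked.

    ```perl
    use strict;
    use warnings;

    my $text = "{value1}+{value2}-{value3}*{value...n}";

    # With no capture groups, a /g match in list context returns every whole match.
    my @tokens = $text =~ / \{ [^{}]* \} | [\/+*-] /gx;

    # Tokens that start with "{" are values; the rest are operators.
    my @values    = grep {  /^\{/ } @tokens;
    my @operators = grep { !/^\{/ } @tokens;

    print "values:    @values\n";
    print "operators: @operators\n";
    ```

    Keeping values and operators in separate arrays preserves their order, so the operator at index i sits between the values at indexes i and i+1.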

    See a complete demo script:

    #!/usr/bin/perl
    use strict;
    use warnings;
     
    my $text = "{value1}+{value2}-{value3}*{value...n}";
     
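    # Validate the whole string first, then walk it token by token.
    if ($text =~ /\A ({ [^{}]* }) (?: [\/+*-] (?1))* \z/x) {
        # Each /g iteration matches the next {value} or operator.
        while ($text =~ / { [^{}]* } | [\/+*-] /gx) {
            print "$&\n";   # $& holds the text of the last successful match
        }
    }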
    

    Output:

    {value1}
    +
    {value2}
    -
    {value3}
    *
    {value...n}
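
    As for why a single match is not enough: a capture group inside a quantifier keeps only its last repetition, so one pass cannot hand back every value and operator. A small sketch of that behaviour, using a shortened sample string and illustrative variable names:

    ```perl
    use strict;
    use warnings;

    my $text = "{value1}+{value2}-{value3}";

    # One match with a repeated group: $1 and $2 keep only the final repetition.
    my ($last_op, $last_value);
    if ($text =~ /\A \{[^{}]*\} (?: ([\/+*-]) (\{[^{}]*\}) )* \z/x) {
        ($last_op, $last_value) = ($1, $2);
        print "single match kept: $last_op $last_value\n";
    }
    ```

    This should print "single match kept: - {value3}", with the earlier + and {value2} lost, which is why the tokens are collected in a separate /g pass.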