Perl Regex Multiple Matches


I'm looking for a regular expression that will behave as follows:

input: "hello world."

output: he, el, ll, lo, wo, or, rl, ld

My idea was something along the lines of:

    while($string =~ m/(([a-zA-Z])([a-zA-Z]))/g) {
        print "$1-$2 ";
    }

But that does something a little bit different.


Solution

  • It's tricky. You have to capture each pair, save it, and then force the regex engine to backtrack so that it tries again from the next position.

    You can do that this way:

    use v5.10;   # first release with backtracking control verbs
    
    my $string = "hello, world!";
    my @saved;
    
    my $pat = qr{
        ( \pL {2} )
        (?{ push @saved, $^N })
        (*FAIL)
    }x;
    
    @saved = ();
    $string =~ $pat;
    my $count = @saved;
    printf "Found %d matches: %s.\n", $count, join(", " => @saved);
    

    That produces this:

    Found 8 matches: he, el, ll, lo, wo, or, rl, ld.
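
    To see what the forced backtrack is doing, here is a minimal sketch that records where each pair starts as well as the pair itself. The `@trace` array and the `offset:pair` format are illustrative additions, not part of the code above. It relies on `pos()` inside a `(?{ })` block reporting the engine's current position in the string being matched, as described in perlre.

    ```perl
    use v5.10;
    use strict;
    use warnings;

    my $string = "hello, world!";

    # Package variable (assumption: chosen to sidestep closure quirks that
    # lexicals inside (?{ }) blocks can have on perls older than 5.18).
    our @trace;

    my $pat = qr{
        ( \pL {2} )                 # capture two letters
        (?{
            # $^N is the text of the group that just closed;
            # pos() is the position right after it.
            push @trace, sprintf "%d:%s", pos() - length($^N), $^N;
        })
        (*FAIL)                     # reject this attempt, try the next start
    }x;

    $string =~ $pat;                # never succeeds; @trace is the by-product
    say join ", ", @trace;
    ```

    Each entry is pushed just before `(*FAIL)` rejects the attempt, so the engine gives up on that starting position and moves on to the next one. The match as a whole never succeeds; the saved pairs are a side effect of the attempts.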
    

    If you do not have v5.10, or you have a headache, you can use this:

    my $string = "hello, world!";
    my @pairs = $string =~ m{
      # we can only match at positions where the
      # following sneak-ahead assertion is true:
        (?=                 # zero-width look ahead
            (               # begin stealth capture
                \pL {2}     #       save off two letters
            )               # end stealth capture
        )
      # succeed after matching nothing, force reset
    }xg;
    
    my $count = @pairs;
    printf "Found %d matches: %s.\n", $count, join(", " => @pairs);
    

    That produces the same output as before.

    But you might still have a headache.
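
    As for why the loop in the question does something different: under `/g`, each match resumes where the previous one ended, so the two letters it consumed are never available to start the next pair. A lookahead consumes nothing, so the next attempt starts one character further on. The sketch below puts the two side by side, using the question's input and character class; the array names are only for illustration.

    ```perl
    use strict;
    use warnings;

    my $string = "hello world.";

    # Consuming match: each iteration resumes after the two letters
    # that were just matched, so overlapping pairs are skipped.
    my @consumed;
    while ($string =~ m/([a-zA-Z]{2})/g) {
        push @consumed, $1;
    }

    # Zero-width match: the lookahead captures two letters but consumes
    # none, so the next attempt begins one character later.
    my @overlapping;
    while ($string =~ m/(?=([a-zA-Z]{2}))/g) {
        push @overlapping, $1;
    }

    print "consumed:    @consumed\n";
    print "overlapping: @overlapping\n";
    ```

    The first loop yields he, ll, wo, rl; the second yields all eight pairs asked for in the question.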