Searching Perl array with regex and return single capturing group only


I have a Perl script in which I perform web service calls in a loop. The server returns a multivalued HTTP header that I need to parse after each call with information that I will need to make the next call (if it doesn't return the header, I want to exit the loop).

I only care about one of the values in the header, and I need to get the information out of it with a regular expression. Let's say the header is like this, and I only care about the "foo" value:

X-Header: test-abc12345; blah=foo
X-Header: test-fgasjhgakg; blah=bar

I can get the header values with @values = $response->header( 'X-Header' );, but how do I quickly

  1. check whether there is a foo value, and
  2. parse and save that value for the next iteration?

Ideally, I'd like to do something like this:

my $value = 'default';

do {
  # (do HTTP request; use $value)
  @values = $response->header( 'X-Header' );
} while( $value = first { /(?:test-)([^;]+)(?:; blah=foo)/ } @values );

But grep, first (from List::Util), etc. all return the entire matching element and not just the single capturing group I want. I want to avoid cluttering up my code by looping over the array and matching/parsing inside the loop body.

Is what I want possible? What would be the most compact way to write it? So far, all I can come up with is using lookarounds and \K to discard the stuff I don't care about, but this isn't super readable and makes the regex engine perform a lot of unnecessary steps.


Solution

  • So it seems that you want to find the first element that matches a certain pattern, but get back only the captured part of the match. And you want it done nicely. Indeed, first and grep only return the element itself.

    However, List::MoreUtils::first_result returns the result of its block rather than the element, so the block can hand back the capture:

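    use feature 'say';  # needed for say
    use List::MoreUtils 0.406 qw(first_result);
    
    my @w = qw(a bit c dIT);  # get first "it" case-insensitive
    
    # The block's value for the first element where it is true is returned
    my $res = first_result { ( /(it)/i )[0] } @w;
    
    say $res // 'undef';  #--> it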
    

    The ( ... )[0] slice is needed to put the match in list context, so that it returns the actual capture rather than just a true/false value. Another way would be firstres { my ($r) = /(it)/i; $r } (firstres is an alias for first_result). Either works.
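
    To make the context point concrete, here is a minimal sketch using only core Perl. The variable names are made up for illustration; it just compares what the same match yields in scalar context, in list context, and through a list slice.

    ```perl
    use strict;
    use warnings;

    my $str = 'dIT';

    # Scalar context: the match only reports success (1) or failure.
    my $in_scalar = $str =~ /(it)/i;

    # List context: the match returns the list of captured groups.
    my ($in_list) = $str =~ /(it)/i;

    # A list slice also forces list context, then picks the first capture.
    my $sliced = ( $str =~ /(it)/i )[0];

    print "$in_scalar $in_list $sliced\n";  # 1 IT IT
    ```

    Inside a firstres block the match runs against $_, so the slice form keeps the whole thing to one expression.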


    For the data in the question

    use warnings;
    use strict;
    use feature 'say';
    
    use List::MoreUtils 0.406 qw(firstres);
    
    my @data = ( 
        'X-Header: test-abc12345; blah=foo',
        'X-Header: test-fgasjhgakg; blah=bar'
    );
    
    if (my $r = firstres { ( /test-([^;]+);\s+blah=foo/ )[0] } @data) {
        say $r
    }
    

    This prints abc12345, which a comment on the question confirmed to be the sought result.


    List::MoreUtils versions prior to 0.406 (of 2015-03-03) didn't have firstres (alias first_result).
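
    If an older module version is all that is available, a core-only fallback is possible. The following is a sketch under assumptions, not part of List::MoreUtils: first_capture is a hypothetical helper, and @responses is made-up data standing in for what $response->header( 'X-Header' ) would return on successive calls. It also shows how the result might drive the loop from the question.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not from any module): return the first capture
    # group of the first element that matches, or undef if none match.
    sub first_capture {
        my ($re, @list) = @_;
        for my $item (@list) {
            # List-context match yields the captures; stop at the first hit.
            if (my ($capture) = $item =~ $re) {
                return $capture;
            }
        }
        return;  # nothing matched
    }

    # Made-up stand-in for the header values of two successive responses.
    my @responses = (
        [ 'test-abc12345; blah=foo', 'test-fgasjhgakg; blah=bar' ],
        [ 'test-zzz999; blah=bar' ],  # no foo value, so the loop ends here
    );

    my $value = 'default';
    my @seen;  # records which value each "request" would have used
    for my $headers (@responses) {
        push @seen, $value;  # (do HTTP request; use $value)
        $value = first_capture(qr/test-([^;]+);\s+blah=foo/, @$headers);
        last unless defined $value;
    }

    print "@seen\n";  # default abc12345
    ```

    Testing with defined rather than plain truth means a captured value of 0 would not end the loop early, which the if (my $r = ...) form above would do.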