
Perl regex - print only modified line (like sed -n 's///p')


I have a command that outputs text in the following format:

misc1=poiuyt
var1=qwerty
var2=asdfgh
var3=zxcvbn
misc2=lkjhgf

etc. I need to get the values for var1, var2, and var3 into variables in a perl script.

If I were writing a shell script, I'd do this:

OUTPUT=$(command | grep '^var')
VAR1=$(echo "${OUTPUT}" | sed -ne 's/^var1=\(.*\)$/\1/p')
VAR2=$(echo "${OUTPUT}" | sed -ne 's/^var2=\(.*\)$/\1/p')
VAR3=$(echo "${OUTPUT}" | sed -ne 's/^var3=\(.*\)$/\1/p')

That populates OUTPUT with the basic content that I want (so I don't have to run the original command multiple times), and then I can pull out each value using sed: VAR1='qwerty', etc.

I've worked with perl in the past, but I'm pretty rusty. Here's the best I've been able to come up with:

my $output = `command | grep '^var'`;
(my $var1 = $output) =~ s/\bvar1=(.*)\b/$1/m;
print $var1;

This correctly matches and references the value for var1, but it also returns the unmatched lines, so $var1 equals this:

qwerty
var2=asdfgh
var3=zxcvbn

With sed I'm able to tell it to print only the modified lines. Is there a way to do something similar in perl? I can't find the equivalent of sed's p modifier in perl.

Alternatively, is there a better way to extract those substrings from each line? I'm sure I could match each line and split the contents or something like that, but I was trying to stick with regex since that's how I'd typically solve this outside of perl.

Appreciate any guidance. I'm sure I'm missing something relatively simple.


Solution

  • One way

    my @values = map { /\bvar(?:1|2|3)\s*=\s*(.*)/ ? $1 : () } qx(command);
    

    The qx operator ("backticks") returns a list of all lines of output when used in list context, here imposed by map. (In a scalar context it returns all output in a string, possibly multiline.) Then map extracts wanted values: the ternary operator in it returns the capture, or an empty list when there is no match (so filtering out such lines). Please adjust the regex as suitable.
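
    As a minimal sketch of the map above: there is no real command to run here, so the array @lines is a stand-in (an assumption) for what qx(command) would return in list context, built from the sample output in the question.

    ```perl
    use strict;
    use warnings;

    # Stand-in for the list that qx(command) would return in list context.
    # Each element keeps its trailing newline, as lines from qx do.
    my @lines = map { "$_\n" } qw(
        misc1=poiuyt var1=qwerty var2=asdfgh var3=zxcvbn misc2=lkjhgf
    );

    # Keep the capture on a match; contribute an empty list otherwise.
    my @values = map { /\bvar(?:1|2|3)\s*=\s*(.*)/ ? $1 : () } @lines;

    print join(',', @values), "\n";   # prints "qwerty,asdfgh,zxcvbn"
    ```

    The captures carry no trailing newline, since . does not match one, so no chomp is needed.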

    Or one can break this up, taking all output, then filtering needed lines, then parsing them. That allows for more nuanced, staged processing. And then there are libraries for managing external commands that make more involved work much nicer.
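
    The staged approach could look roughly like this. The array @out is a stand-in for qx(command), and splitting on the first = is just one possible way to parse a line, not the only one.

    ```perl
    use strict;
    use warnings;

    # Stand-in for: my @out = qx(command);
    my @out = map { "$_\n" } qw(
        misc1=poiuyt var1=qwerty var2=asdfgh var3=zxcvbn misc2=lkjhgf
    );

    # Stage 1: drop the trailing newlines.
    chomp @out;

    # Stage 2: keep only the lines of interest.
    my @wanted = grep { /^var[123]=/ } @out;

    # Stage 3: take everything after the first "=" on each kept line.
    my @values = map { (split /=/, $_, 2)[1] } @wanted;

    print "$_\n" for @values;
    ```

    Each stage leaves an array behind (@out, @wanted, @values) that can be inspected or processed further on its own.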


    A comment on the Perl attempt shown in the question

    Since the result of the backticks is assigned to a scalar, the backticks are in scalar context and thus return all output in one string, here multiline. The substitution that follows, which replaces var1=(.*) with $1, then leaves the next two lines untouched: . does not match a newline, so .* stops at the first newline character.

    So you'd need to amend that regex to match all the rest of the string as well, so that everything gets replaced with the capture $1. But then the pattern would have to be different for the other variables. Or you could replace the input string with all three var-values, but then you'd have a single string with those three values in it.

    So altogether: using the substitution here (s///) isn't suitable -- just use matching, m//.
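
    A small sketch of the difference, with $output built by hand as a stand-in for what the backticks in the question would return:

    ```perl
    use strict;
    use warnings;

    # Stand-in for: my $output = `command | grep '^var'`;
    my $output = join '', map { "$_\n" } qw(var1=qwerty var2=asdfgh var3=zxcvbn);

    # Substitution: only the matched part is replaced, the other lines stay.
    (my $subst = $output) =~ s/\bvar1=(.*)\b/$1/m;

    # Matching in list context returns just the capture.
    # With /m, ^ and $ anchor at the start and end of each line.
    my ($var1) = $output =~ /^var1=(.*)$/m;
    my ($var2) = $output =~ /^var2=(.*)$/m;
    my ($var3) = $output =~ /^var3=(.*)$/m;

    print "substitution left:\n$subst";
    print "match gave: $var1 $var2 $var3\n";
    ```

    The parentheses around my ($var1) matter: they put the match in list context, so the capture is assigned rather than a true/false result.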

    Since in list context the match operator (with /g) returns all matches, another way is

    my @values = qx(command) =~ /\bvar(?:1|2|3)\s*=\s*(.*)/g;
    

    Now, being bound to a regex, qx is in scalar context and so it returns a (here multiline) string, which is then matched by the regex. With the /g modifier the pattern keeps being matched through that string, capturing all wanted values (and nothing else). The fact that . doesn't match a newline, so that .* stops at the first newline character, is now useful.
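
    To see why it matters that . stops at a newline, here is a sketch that runs the same pattern with and without the /s modifier (which lets . match a newline). The string $output is again a hand-built stand-in for the scalar-context result of qx(command).

    ```perl
    use strict;
    use warnings;

    # Stand-in for the single multiline string qx(command) gives in scalar context.
    my $output = join '', map { "$_\n" } qw(
        misc1=poiuyt var1=qwerty var2=asdfgh var3=zxcvbn misc2=lkjhgf
    );

    # Without /s: .* ends at each newline, so /g yields one capture per var line.
    my @values = $output =~ /\bvar(?:1|2|3)\s*=\s*(.*)/g;

    # With /s: the first .* runs to the end of the string, leaving a single capture.
    my @greedy = $output =~ /\bvar(?:1|2|3)\s*=\s*(.*)/gs;

    print scalar(@values), " captures without /s\n";   # prints "3 captures without /s"
    print scalar(@greedy), " capture with /s\n";       # prints "1 capture with /s"
    ```

    One caveat when matching against the whole string: \s also matches a newline, so if a value can be empty, the \s* after = would run on into the next line. Using \h* (horizontal whitespace) there avoids that.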

    Again, please adjust the regex as suitable to your real problem.


    Another need came up, to capture both the actual names of variables and their values. Then add capturing parens around names, and assign to a hash

    my %val = map { /\b(var(?:1|2|3))\s*=\s*(.*)/ ? ($1, $2) : () } qx(command);
    

    or

    my %val = qx(command) =~ /\b(var(?:1|2|3))\s*=\s*(.*)/g;
    

    Now, for each line of output from command, the map returns a var-name and value pair, and a list of such pairs can be assigned to a hash. The same goes for the successive matches (under /g) in the second case.
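
    A short sketch of using the resulting hash, with both variants run against hand-built stand-ins for the command output (the names %from_map and %from_match are only for illustration):

    ```perl
    use strict;
    use warnings;

    # Stand-ins for qx(command): a list of lines, and the same output as one string.
    my @lines = map { "$_\n" } qw(
        misc1=poiuyt var1=qwerty var2=asdfgh var3=zxcvbn misc2=lkjhgf
    );
    my $output = join '', @lines;

    # Variant 1: map returns a (name, value) pair per matching line.
    my %from_map = map { /\b(var(?:1|2|3))\s*=\s*(.*)/ ? ($1, $2) : () } @lines;

    # Variant 2: /g in list context returns both captures for every match.
    my %from_match = $output =~ /\b(var(?:1|2|3))\s*=\s*(.*)/g;

    # Look values up by name.
    print "$_ => $from_match{$_}\n" for sort keys %from_match;
    ```

    Either way, a single value is then available by name, for example $from_match{var2}.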