
Multi platform script perl or awk


I am trying to match records in the following format:

(-,username,domain1.co.uk)\
(-,username,domain2.co.uk)

Either awk or Perl must be used. I am using Cygwin and wrote the following code, which works and matches both of the above entries:

awk 'BEGIN {musr="(-,username,[^)]+.co.uk)"} {if ($0~musr) print $0}' netgroup

But if I try to make this regexp more specific, nothing is output:

1st: match the record, then the trailing backslash, then the end of the line:

"(-,username,[^)]+.co.uk)\\$"

2nd: match the end of the line immediately after the record, without a backslash:

"(-,username,[^)]+.co.uk)$"

So I decided to rewrite the script in Perl, hoping that Perl can deal with backslashes and end-of-line symbols. For this purpose I used a2p this way:

echo  'BEGIN {musr="(-,username,[^)]+.co.uk)"} {if ($0~musr) print $0}' | a2p.exe 
#!/usr/bin/perl
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if $running_under_some_shell;
                        # this emulates #! processing on NIH machines.
                        # (remove #! line above if indigestible)

eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z_0-9]+=)(.*)/ && shift;
                        # process any FOO=bar switches

$, = ' ';               # set output field separator
$\ = "\n";              # set output record separator

$musr = '(-,username,[^)]+.co.uk)';

while (<>) {
    chomp;      # strip record separator
    if ($_ =~ $musr) {
        print $_;
    }
}

This generated Perl script also matches both entries. However, if I try to make it more specific, I get the following errors:

1st:

$musr = "(-,username,[^)]+.co.uk)\\";
Trailing \ in regex m/(-,username,[^)]+.co.uk)\/ at perlmatch.pl line 18, <> line 1.

2nd:

$musr = "(-,username,[^)]+.co.uk)$";
Final $ should be \$ or $name at perlmatch.pl line 14, within string
syntax error at perlmatch.pl line 14, near "= "(-,username,[^)]+.co.uk)$""
Execution of perlmatch.pl aborted due to compilation errors.

3rd:

$musr = "(-,username,[^)]+.co.uk)\$";
[the output is nothing]

What am I doing wrong? My question also points to the fact that if somebody needs to use a script on several platforms (AIX, Solaris, Linux), then using Perl should be a better approach than dealing with (non-)GNU utils and various (g|n)awk versions, etc. Regards


Solution

  • Your problems arise from string quoting in Perl.

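    $musr = "(-,username,[^)]+.co.uk)\\"; is a double-quoted string, so Perl replaces \\ with a single backslash when the string is created. The regex engine then sees a pattern ending in a lone backslash, hence the "Trailing \ in regex" error. To match one literal backslash the regex itself needs two, so you would have to put four in when you create the string.

    $musr = "(-,username,[^)]+.co.uk)$"; fails because the $ just before the closing quote is treated as the start of a variable interpolation within the string, which is what "Final $ should be \$ or $name" is complaining about.

    In addition, parentheses should be escaped, as John Kugelman noted. Unescaped, they form a group instead of matching literal ( and ) characters. That is why the third attempt prints nothing: the pattern then requires co.uk to sit right at the end of the line, but in the data a ) comes after it.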

    The solution is to use Perl's built-in delimiters for regular expressions, rather than normal quoted strings. The simple way is to put it right into your loop:

    while (<>) {
        chomp;      # strip record separator
        if ($_ =~ /\(-,username,[^)]+\.co\.uk\)$/) {
            print $_;
        }
    }
    

    If you do need to put the pattern into a variable first, use the special qr// operator.

    my $musr = qr/\(-,username,[^)]+\.co\.uk\)$/;
    while (<>) {
        chomp;      # strip record separator
        if ($_ =~ $musr) {
            print $_;
        }
    }
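
The same reasoning seems to apply to the original awk attempts. When a quoted string is used as a regex, awk processes the string escapes first and only then compiles the result as a regex, so "\\$" reaches the regex engine as \$ (a literal dollar sign), and the unescaped parentheses act as a group. Below is a minimal sketch of two ways around that, assuming a POSIX-style awk. The sample, awk_continued and awk_last helper names are hypothetical and the data is the two records from the question.

```shell
#!/bin/sh
# Hypothetical sample data: the two records from the question.
sample() {
    printf '%s\n' '(-,username,domain1.co.uk)\' '(-,username,domain2.co.uk)'
}

# Regex literal: \\ is one literal backslash, $ anchors at the end of the record.
awk_continued() {
    awk '/\(-,username,[^)]+\.co\.uk\)\\$/'
}

# Same idea with the pattern held in a string: every backslash is doubled,
# because awk handles string escapes before it compiles the regex.
awk_last() {
    awk 'BEGIN { musr = "\\(-,username,[^)]+\\.co\\.uk\\)$" } $0 ~ musr'
}

sample | awk_continued   # prints only the domain1 record
sample | awk_last        # prints only the domain2 record
```

The regex literal form is usually easier to read. The string form is only needed when the pattern has to live in a variable, as in the original script.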