Simple parse not printing out date in string


I'm not sure why my regex isn't capturing the date from this string. I got it to work by putting a ? after the .*, but I don't understand why the first pattern doesn't work. Here are the string and both patterns.

my $string = "/usr/local/bin/python3.9 -u /usr/local/bin/scripts/master_program.py 4002 daily true 20230421";
my $pythonpgm = "master_program.py";
my ($midmark1) = $string =~ /.*$pythonpgm \d+ .*(\d+).*/;
my ($midmark2) = $string =~ /.*$pythonpgm \d+ .*?(\d+).*/;

print "\nmidmark1 =>  $midmark1\n";
print "\nmidmark2 =>  $midmark2\n\n";

This is the printout:

midmark1 =>  1

midmark2 =>  20230421

Solution

  • The second .* is greedy: it matches as much as it can. It would match up to the end of the string, but then the (\d+) would have nothing left to match.

    ...master_program.py 4002 daily true 20230421
    \_/\_______________/|\__/|\_________________/         XXX Wrong
     |         |        | |  |        |          X
     |         |        |  \  \       |          |
     .*    $pythonpgm  [ ] \d+ [ ]    .*       (\d+) .*
    

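    So it backtracks, giving up one character so that (\d+) can match. That character is the final 1, which is why $midmark1 is 1. With .*?, the match is non-greedy and takes as few characters as possible, so (\d+) starts at the first digit of the date and captures all of 20230421.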

    ...master_program.py 4002 daily true 20230421
    \_/\_______________/|\__/|\________________/||
     |         |        | |  |        |         | \
     |         |        |  \  \       |         |  \
     .*    $pythonpgm  [ ] \d+ [ ]    .*      (\d+) .*
    

    I'd use

    / \Q$pythonpgm\E [ ] \d+ [ ] .* [ ] (\d+) \z /xs
    

    or just

    / \Q$pythonpgm\E [ ] .* [ ] (\d+) \z /xs
    

    The space before the date acts as an anchor: .* has to backtrack to the last space, so (\d+) captures the whole date.
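
As a quick check of the behaviour described above, here is a minimal sketch that runs three patterns against the string from the question: the greedy one, the non-greedy one, and the first suggested pattern. The variable names $greedy, $lazy and $anchored are only illustrative, and \Q...\E has been added to the first two patterns, which the question's code did not have.

```perl
use strict;
use warnings;

my $string    = "/usr/local/bin/python3.9 -u /usr/local/bin/scripts/master_program.py 4002 daily true 20230421";
my $pythonpgm = "master_program.py";

# Greedy .* takes as much as it can and gives back only one character,
# so (\d+) is left with just the last digit.
my ($greedy) = $string =~ /.*\Q$pythonpgm\E \d+ .*(\d+).*/;

# Non-greedy .*? takes as little as it can, so (\d+) begins at the
# first digit it reaches and then consumes the whole run of digits.
my ($lazy) = $string =~ /.*\Q$pythonpgm\E \d+ .*?(\d+).*/;

# A literal space before the capture and \z after it pin the capture
# to the final space-separated field.
my ($anchored) = $string =~ / \Q$pythonpgm\E [ ] \d+ [ ] .* [ ] (\d+) \z /xs;

print "greedy   => $greedy\n";      # 1
print "lazy     => $lazy\n";        # 20230421
print "anchored => $anchored\n";    # 20230421
```

\Q...\E quotes the . in $pythonpgm so that it matches only a literal dot. Because /x ignores unescaped whitespace in the pattern, a literal space has to be written as [ ].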