Search code examples
perlexpect

Expect module: end of string is correctly matched, but characters remain in 'after' part, causing next expect to continue prematurely


I am writing a Perl script with the Expect module, where sometimes expects continues too early, because it matches a prompt of a previous command's output.

To illustrate the problem, I created a simple example script as per below:

#!/usr/bin/perl -w

use Expect;

# disable Expect multiline matching (/m)
$Expect::Multiline_Matching = 0;

my $spawn = new Expect;
my $pattern = qr{[\r\n]+[^\r\n<]+[#>%] ?$};

# silence regular output:
$spawn->log_stdout(0);

# enable internal debug:
$spawn->exp_internal(1);

# spawn process (here: open serial console via picocom):
$spawn->spawn("picocom --baud 9600 --flow n /dev/ttyS8") or die "Cannot spawn picocom: $!\n";

# wait for output of spawned process to settle down:
$spawn->expect(2, [qr{.+}s => sub { exp_continue; } ], );

# send 2x newline ("enter") for demonstration of the problem:
$spawn->send("\n\n");

# wait for $pattern in response of spawned process
$spawn->expect(10, '-re', $pattern);

# cleanup (close picocom)
$spawn->send("\x01\x11");
$spawn->expect(10, '-re', qr/Thanks for using picocom/);
$spawn->hard_close();

Snippet from debug output:

Sending '\n\n' to spawn id(3)
 at /usr/share/perl5/Expect.pm line 1421.
        Expect::print(Expect=GLOB(0x17a5210), "\x{a}\x{a}") called at ./tmp.pl line 24
Starting EXPECT pattern matching...
 at /usr/share/perl5/Expect.pm line 597.
        Expect::expect(Expect=GLOB(0x17a5210), 10, "-re", qr([\r\n]+[^\r\n<]+[#>%] ?$)) called at ./tmp.pl line 27
spawn id(3): list of patterns:
  #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'


spawn id(3): Does `'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{mas'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{master:0}\r\n'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@rou'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroo'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router'
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.


spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router> '
match:
  pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? YES!!
    Before match string: `\r\n\r\n{master:0}'
    Match string: `\r\nroot@router> '
    After match string: `\r\n\r\n{master:0}\r\nroot@router> '
    Matchlist: ()

What would I expect: an empty "After match" string:

After match string:`'

This residue in the actual "After match" string:

After match string:`\\r\\n\\r\\n{master:0}\\r\\nroot@router\> '

would lead to a premature expect match if I do another send/expect call.

Multiline Matching is disabled, which seems to work correctly, otherwise the match would have occured earlier, would it not?

Edit: after reading the first replies to my question: I tend to think that this is not a regexp issue, my pattern works as intended, which is confirmed by the exp_internal output; the "YES!!", indicating the successful match is exactly where I expect it to be, anchored to the end-of-string. It is the matching result that seems weird. Almost like a bug?


Solution

  • I wrote the same example script in native tcl expect, and there it behaves differently (and as intended).

    The tcl expect example script:

    #!/usr/bin/expect -f
    
    exp_internal 1
    log_user 0
    
    set pattern {[\r\n]+[^\r\n<]+[#>%] ?$}
    
    # spawn process (here: open serial console via picocom):
    spawn picocom --baud 9600 --flow n /dev/ttyS8
    
    # wait for output of spawned process to settle down:
    expect -timeout 2 -re {.+} exp_continue
    
    # send 2x newline ("enter"):
    send "\n\n"
    expect -re $pattern
    
    send_user "Anything left in the buffer?\n"
    expect -timeout 2 -re {.+} exp_continue
    
    send "\x01\x11"
    expect "Thanks for using picocom"
    close
    

    Relevant debug output:

    send: sending "\n\n" to { exp4 }
    Gate keeper glob pattern for '[\r\n]+[^\r\n<]+[#>%] ?$' is ''. Not usable, disabling the performance booster.
    
    expect: does "" (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=no
    
    expect: does "\r\n\r\n{mas" (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=no
    
    expect: does "\r\n\r\n{master:0}\r\n" (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=no
    
    expect: does "\r\n\r\n{master:0}\r\nroot@rou" (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=no
    
    expect: does "\r\n\r\n{master:0}\r\nroot@router> \r\n\r" (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=no
    
    expect: does "\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master" (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=no
    
    expect: does "\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroo" (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=no
    
    expect: does "\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router" (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=no
    
    expect: does "\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router> " (spawn_id exp4) match regular expression "[\r\n]+[^\r\n<]+[#>%] ?$"? (No Gate, RE only) gate=yes re=yes
    expect: set expect_out(0,string) "\r\nroot@router> "
    expect: set expect_out(spawn_id) "exp4"
    expect: set expect_out(buffer) "\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router> "
    Anything left in the buffer?
    Gate keeper glob pattern for '.+' is ''. Not usable, disabling the performance booster.
    
    expect: does "" (spawn_id exp4) match regular expression ".+"? (No Gate, RE only) gate=yes re=no
    expect: timed out
    

    The expect_out(buffer) is the equivalent of the concatenated "Before" and "match" in the perl's version debug output; in contrast to the perl version, the match result contains everything up to the anchor at end-of-string, and not only the match up the first end-of-line; the rest of the buffer is empty, thus the timeout of the following expect .+ call.

    I think my initial problem is a bug in the perl expect I currently use.

    ~$ dpkg -l|egrep expect
    ii  expect                        5.45.4-2+b1                    i386         Automates interactive applications
    ii  libexpect-perl                1.35-2                         all          Perl Expect interface
    ii  tcl-expect:i386               5.45.4-2+b1                    i386         Automates interactive applications (Tcl package)
    ~$ lsb_release -a
    No LSB modules are available.
    Distributor ID: Debian
    Description:    Debian GNU/Linux 12 (bookworm)
    Release:        12
    Codename:       bookworm
    ~$
    

    So I might be affected by this bug: https://github.com/jacoby/expect.pm/issues/10

    Edit2: I downloaded the patched Expect.pm from the creator of the above mentioned github issue, and it solves my problem:

    Sending '\n\n' to spawn id(3)
     at /usr/share/perl5/Expect.pm line 1413.
            Expect::print(Expect=GLOB(0x1681210), "\x{a}\x{a}") called at ./tmp.pl line 24
    Starting EXPECT pattern matching...
     at /usr/share/perl5/Expect.pm line 597.
            Expect::expect(Expect=GLOB(0x1681210), 10, "-re", qr([\r\n]+[^\r\n<]+[#>%] ?$)) called at ./tmp.pl line 27
    spawn id(3): list of patterns:
      #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'
    
    
    spawn id(3): Does `'
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.
    
    
    spawn id(3): Does `\r\n\r\n{mas'
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.
    
    
    spawn id(3): Does `\r\n\r\n{master:0}\r\n'
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.
    
    
    spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@rou'
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.
    
    
    spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r'
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.
    
    
    spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master'
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.
    
    
    spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroo'
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.
    
    
    spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router'
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? No.
    
    
    spawn id(3): Does `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}\r\nroot@router> '
    match:
      pattern #1: -re `(?^:[\\r\\n]+[^\\r\\n<]+[#>%] ?$)'? YES!!
        Before match string: `\r\n\r\n{master:0}\r\nroot@router> \r\n\r\n{master:0}'
        Match string: `\r\nroot@router> '
        After match string: `'
        Matchlist: (`')