Search code examples
pythonnode.jspexpectcprofile

pexpect sendline is too slow


I'm running a subprocess using pexpect. I'm communicating with it off and on via sendline and readline. cProfile indicates my program spends most of its time waiting around for sendline to finish, which is not desired behavior. Python pseudocode:

def reset(self):
  self.proc = pexpect.spawn("node ./my_lovely_program.js")
...
def step(self, info):
    self.proc.sendline(info)
    while (my_condition):
        response = self.proc.readline().decode()
        # evaluate my_condition
    return retval

Meanwhile, in the style of Waiting for user to enter input in Node.js, buried well into my node application is:

let ans = await this.askQuestion("");
// do_stuff(ans);

This gets run by two different instances of the relevant class at the same time. This is all well and good and runs satisfactorily. Then I run

python3 -m cProfile -s cumtime main.py

And get this:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    498/1    0.002    0.000  262.820  262.820 {built-in method builtins.exec}
        1    0.107    0.107  262.820  262.820 main loop
        1    0.099    0.099  262.081  262.081 also main loop
     3774    0.022    0.000  214.272    0.057 main.py:24(step)
    33596  203.977    0.006  203.977    0.006 {built-in method time.sleep}
     3774    0.017    0.000  189.138    0.050 pty_spawn.py:570(sendline)
     3774    0.049    0.000  189.105    0.050 pty_spawn.py:526(send)
     3875    0.158    0.000   53.767    0.014 main.py:28(scrape_input)
    14756    0.052    0.000   53.434    0.004 spawnbase.py:459(readline)
    14756    0.056    0.000   53.382    0.004 spawnbase.py:240(expect)
    14756    0.072    0.000   52.985    0.004 spawnbase.py:343(expect_list)
    14756    0.242    0.000   52.848    0.004 expect.py:91(expect_loop)
    29721    0.235    0.000   47.113    0.002 pty_spawn.py:415(read_nonblocking)
    69657    0.113    0.000   46.250    0.001 pty_spawn.py:448(select)
    69657    0.103    0.000   46.137    0.001 utils.py:130(select_ignore_interrupts)
    69657   46.016    0.001   46.016    0.001 {built-in method select.select}
...

Between them, readline and especially sendline take up over 90% of the programs runtime. Most of that time is spent sleeping. This is very sad and I want to fix it. I'm sending 20-30 characters each time. I've tried reading from stdin in other ways (e.g. https://www.dev2qa.com/node-js-get-user-input-from-command-line-prompt-example/) and am ready to keep going down that route if that's the way to go. (The issue I am having there is a separate one; on the second communication, the input gets read twice). Posix_spawn performance when capturing process output used popen; I remember using that several months ago and it deadlocking.


Solution

  • pexpect has a default 50ms wait time for send(). You can disable this by setting:

    proc.delaybeforesend = None
    

    See the documentation for an explanation of why they put this delay in.