I wrote a script that accessed a bunch of servers using nc on the command line, and originally I was using Python's commands module and calling commands.getoutput(). The script ran in about 45 seconds. Since commands is deprecated, I want to change everything over to using the subprocess module, but now the script takes 2 min 45 secs to run. Why would this be?
What I had before:
output = commands.getoutput("echo get file.ext | nc -w 1 server.com port_num")
Now I have:
p = Popen('echo get file.ext | nc -w 1 server.com port_num', shell=True, stdout=PIPE)
output = p.communicate()[0]
I would expect subprocess to be slower than commands. Without meaning to suggest that this is the only reason your script is running slowly, you should take a look at the commands source code. It has fewer than 100 lines, and most of the work is delegated to functions from os, many of which are taken straight from C POSIX libraries (at least on POSIX systems). Note that commands is Unix-only, so it doesn't have to do any extra work to ensure cross-platform compatibility.
Now take a look at subprocess. There are more than 1500 lines, all pure Python, doing all sorts of checks to ensure consistent cross-platform behavior. Based on this, I would expect subprocess to run slower than commands.
I timed the two modules, and on something quite basic, subprocess was almost twice as slow as commands.
>>> %timeit commands.getoutput('echo "foo" | cat')
100 loops, best of 3: 3.02 ms per loop
>>> %timeit subprocess.check_output('echo "foo" | cat', shell=True)
100 loops, best of 3: 5.76 ms per loop
Swiss suggests some good improvements that will help your script's performance. But even after applying them, note that subprocess is still slower.
>>> %timeit commands.getoutput('echo "foo" | cat')
100 loops, best of 3: 2.97 ms per loop
>>> %timeit Popen('cat', stdin=PIPE, stdout=PIPE).communicate('foo')[0]
100 loops, best of 3: 4.15 ms per loop
Assuming you are performing the above command many times in a row, this will add up, and account for at least some of the performance difference.
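If you want to repeat this kind of measurement outside IPython, here is a rough sketch. It assumes Python 3 (where the commands module no longer exists, so only the two subprocess variants are compared) and a POSIX system with sh and cat available. The helper names via_shell, via_stdin and per_call are made up for illustration, and the numbers you get will depend on your machine.

```python
import subprocess
import timeit


def via_shell():
    # Spawns a shell, which in turn runs echo and cat as a pipeline.
    return subprocess.check_output('echo "foo" | cat', shell=True)


def via_stdin():
    # Runs cat directly and feeds it the input, so no shell is started.
    proc = subprocess.Popen(["cat"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # A trailing newline is sent so the output matches what echo produces.
    return proc.communicate(b"foo\n")[0]


def per_call(func, number=100):
    # Average wall-clock seconds per call over `number` runs.
    return timeit.timeit(func, number=number) / number


if __name__ == "__main__":
    for func in (via_shell, via_stdin):
        print("%s: %.2f ms per call" % (func.__name__, per_call(func) * 1000))
```

Both functions return the same bytes, so any difference in the timings comes from how the process is launched rather than from the work cat does.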
In any case, I am interpreting your question as being about the relative performance of subprocess and commands, rather than being about how to speed up your script. For the latter question, Swiss's answer is better.