
Does calling a shell command from within a scripting language slow down performance?


When writing Python, Perl, Ruby, or PHP, I'll often use ...

Perl:
`[SHELL COMMAND HERE]`
system("[SHELL]", "[COMMAND]", "[HERE]")

Python:
import os
os.system("[SHELL COMMAND HERE]")
from subprocess import call
call(["[SHELL]", "[COMMAND]", "[HERE]"])

Ruby:
`[SHELL COMMAND HERE]`
system("[SHELL COMMAND HERE]")

PHP:
shell_exec("[SHELL COMMAND HERE]")

How much does spawning a subprocess in the shell slow down the performance of a program? For example, I was just writing a script with Perl and libcurl, and with all of libcurl's parameters it was difficult to get working. I stopped using libcurl and switched to calling curl. The performance seemed to IMPROVE, scripting became much easier, and I could run my script on systems that had only basic Perl (no CPAN modules) and the basic shell utilities installed.

Why is spawning this subshell considered bad programming practice? In theory, should it always be much slower than using a specific binding or equivalent library within the language?


Solution

  • The first reason executing shell commands is considered bad is maintainability. Context switching between tasks is bad enough without switching languages as well. Security is also a consideration, but good coding practice (avoiding injections, ...) makes it less significant.

    There are several factors that impact performance:

    1. Forking a process: Starting a new process takes time, but if the code being executed performs well, this cost becomes less significant.
    2. Optimization becomes impossible: Once control is handed over to another process, the interpreter or compiler cannot optimize across that boundary, and neither can you.
    3. Blocking: Shell commands run this way are blocking operations; the script waits until the command finishes. They will not be scheduled like a native part of the code would.
    4. Parsing: If the output needs further processing, it has to be parsed first. In native code, the data would already be in a suitable data structure. Parsing is also error-prone.
    5. Command line generation: Building the command line for an executable may require iterating over the data. Sometimes that takes more cycles than performing the same operation natively.
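
Points 4 and 5 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the helper names (`size_via_subprocess`, `size_native`, `demo`) are hypothetical, and a second Python interpreter stands in for an arbitrary external tool so that the example does not depend on any particular shell utility. The external route has to build an argument list and parse text output, while the native route returns an integer directly.

```python
import os
import subprocess
import sys
import tempfile


def size_via_subprocess(path):
    """Ask a child process for a file's size and parse its text output."""
    # Command line generation: the arguments are passed as a list, so no shell
    # is involved and `path` needs no quoting or escaping.
    # Assumption: another Python interpreter stands in for an external tool.
    cmd = [
        sys.executable,
        "-c",
        "import os, sys; print(os.path.getsize(sys.argv[1]))",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    # Parsing: the answer arrives as text and must be converted before use.
    return int(result.stdout.strip())


def size_native(path):
    """Get the same value in-process; it is already an int."""
    return os.path.getsize(path)


def demo():
    """Create a small temporary file and measure it both ways."""
    # The file name contains a space on purpose, to show that the
    # list-based call needs no escaping.
    with tempfile.NamedTemporaryFile(delete=False, suffix=" with space.txt") as f:
        f.write(b"hello world")
        path = f.name
    try:
        return size_via_subprocess(path), size_native(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Both routes should agree on the result; the difference is the extra process, the argument list to build, and the `int(...)` parsing step that the native call does not need.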

    Most of these problems arise when the external command is executed in a loop. It is also easy to find cases where none of them becomes a problem.
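
The loop effect can be checked with a rough sketch like the one below. It makes several assumptions: the function names are hypothetical, the "external command" is just a second Python interpreter that upper-cases its input, and the words contain no newlines. It compares spawning one child per item, spawning a single child for the whole batch, and doing the work natively. Actual timings depend heavily on the machine and operating system, so none are quoted here.

```python
import subprocess
import sys
import time

# Child program (a stand-in for any external command):
# upper-case everything read from stdin.
CHILD = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def upper_via_child(text):
    """Send text to a freshly spawned child process and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def spawn_per_item(words):
    """One process per word: the spawn cost is paid len(words) times."""
    return [upper_via_child(w) for w in words]


def spawn_once(words):
    """One process for the whole batch: the spawn cost is paid once.

    Assumes a non-empty list of words that contain no newlines.
    """
    return upper_via_child("\n".join(words)).split("\n")


def native(words):
    """No process at all."""
    return [w.upper() for w in words]


def timed(func, words):
    """Return (result, elapsed seconds) for func(words)."""
    start = time.perf_counter()
    out = func(words)
    return out, time.perf_counter() - start


if __name__ == "__main__":
    sample = ["alpha", "beta", "gamma"] * 10
    for func in (spawn_per_item, spawn_once, native):
        _, seconds = timed(func, sample)
        print(f"{func.__name__}: {seconds:.4f} s")
```

If the per-item variant turns out to dominate the runtime, batching the work into a single external call is often enough to make the subprocess cost negligible, which matches the asker's experience with curl.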