Why does ruby’s `system` not `shellescape` the first argument?


When referenced in a shell, directory names with parentheses such as:

/tmp/(example)

need to be escaped like:

/tmp/\(example\)

When I reference them in a system call in ruby, I have to escape them or not depending on whether they are the first argument or not.

  • Unescaped directory as the first argument. Failure.

    system('/tmp/(example)/script')
    #>> sh: -c: line 0: syntax error near unexpected token `example'
    #>> sh: -c: line 0: `/tmp/(example)/script'
    #=> false
    
  • Escaped directory as the first argument. Success.

    system('/tmp/(example)/script'.shellescape)
    #=> true
    
  • Unescaped directory as the second argument. Success.

    system('touch', '/tmp/(example)/script')
    #=> true
    
  • Escaped directory as the second argument. Failure.

    system('touch', '/tmp/(example)/script'.shellescape)
    #>> touch: /tmp/\(example\)/script: No such file or directory
    #=> false
    

It seems that system escapes every argument except the command (the first argument). That is an issue in scripts whose command path is built from the script's own directory, such as:

system("#{__dir__}/something")

  1. Why does system behave this way?
  2. Is there a native option to make it escape everything?

Solution

  • Blindly sending a whole command through #shellescape rarely does the right thing. Consider this:

    > puts '(pancakes) --house'.shellescape
    \(pancakes\)\ --house
    

    Sure, you probably want to escape the parentheses, but you almost certainly don't want to escape the space. To do The Right Thing and DWIM (Do What I Mean), #system would have to guess which parts should be escaped and which shouldn't, and it would inevitably guess wrong much of the time.
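To make the difference concrete, here is a small sketch that escapes each word separately instead of the whole command line. It relies on `Shellwords.join` from Ruby's standard library; the words themselves are just the placeholders from the example above.

```ruby
require 'shellwords'

# Escaping the whole string makes the space part of one single word.
whole = '(pancakes) --house'.shellescape

# Escaping word by word keeps the space as a separator between two words.
per_word = Shellwords.join(['(pancakes)', '--house'])

puts whole     # \(pancakes\)\ --house
puts per_word  # \(pancakes\) --house
```

This only helps if you already know where the word boundaries are, which is exactly the information a single command string does not carry.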

    The solution is to never ever use the single argument form of #system so that you never invoke a shell at all. If you say things like:

    system('/bin/ls -l')
    

    then the string is treated as a command line that has to be parsed before the command is invoked (by a shell, or by Ruby itself when the string contains no shell metacharacters). If you instead say:

    system('/bin/ls', '-l')
    

    then /bin/ls is invoked directly, no shell is involved so there's no escaping to worry about. Of course, that sometimes leads to silly things like:

    system('/bin/ls', '--')
    system(['/bin/ls', '--'])
    

    to invoke a command without arguments and without a shell, but presumably that's not very common, and you can fall back on manual escaping when you know that's what you're dealing with.
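For the situation in the question (a script path containing parentheses, run with no arguments), the `[path, argv0]` array form might look like the sketch below. The temporary directory and the tiny shell script are made up for the demonstration, and it assumes a Unix-like system where files in the temporary directory may be executed.

```ruby
require 'tmpdir'
require 'fileutils'

result = nil
Dir.mktmpdir do |tmp|
  dir = File.join(tmp, '(example)')      # hypothetical directory with parentheses
  FileUtils.mkdir_p(dir)
  script = File.join(dir, 'script')      # hypothetical script that just exits 0
  File.write(script, "#!/bin/sh\nexit 0\n")
  FileUtils.chmod(0o755, script)

  # [path, argv0]: the program at `path` is executed directly,
  # so no shell ever sees the parentheses.
  result = system([script, 'script'])
end

p result
```

The second array element only sets the name the program sees as its argv[0]; it is not an argument.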

    I tend to pretend that #system doesn't exist and go straight to Open3 when I need to interact with external programs. Open3 is slightly more verbose but has a cleaner and easier to use interface IMO.

    Similar arguments apply to using backticks to execute a command and capture its output.
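As a rough illustration of the Open3 route, the sketch below uses `Open3.capture2` with separate arguments, which covers both running a command and capturing its output. The `echo` command and the path are stand-ins; the path does not need to exist because `echo` only prints it.

```ruby
require 'open3'

# Each argument is handed to the program as-is; no shell parses them,
# so the parentheses need no escaping.
output, status = Open3.capture2('echo', '/tmp/(example)/script')

puts output
puts status.success?
```

`Open3.capture3` works the same way and additionally returns standard error as a separate string.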


    Your example with the unescaped directory as the second argument:

    # Unescaped directory as second argument. Success.
    > system('touch', '/tmp/(example)/script')
    => true
    

    works because no shell is invoked so the parentheses in the second argument have no special meaning.

    Your example with the escaped directory as the second argument:

    # Escaped directory as second argument. Failure.
    > system('touch', '/tmp/(example)/script'.shellescape)
    touch: /tmp/\(example\)/script: No such file or directory
    => false
    

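    fails for the same reason: no shell is invoked, so nothing removes the backslashes added by #shellescape and touch receives them literally as part of the path.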
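The four combinations can be reproduced with a sketch like the following. It uses `touch` throughout (so nothing in the temporary directory has to be executable), creates a throwaway directory with parentheses in its name, and silences the expected error messages with `err: File::NULL`; the result keys are invented labels.

```ruby
require 'shellwords'
require 'tmpdir'
require 'fileutils'

results = {}
Dir.mktmpdir do |tmp|
  dir = File.join(tmp, '(example)')   # throwaway directory with parentheses
  FileUtils.mkdir_p(dir)
  path = File.join(dir, 'file')

  # One string: a shell parses it, so the parentheses must be escaped.
  results[:shell_unescaped] = system("touch #{path}", err: File::NULL)
  results[:shell_escaped]   = system("touch #{path.shellescape}", err: File::NULL)

  # Several arguments: no shell, so the path must be passed verbatim.
  results[:direct_unescaped] = system('touch', path, err: File::NULL)
  results[:direct_escaped]   = system('touch', path.shellescape, err: File::NULL)
end

p results
```

On a typical Unix system the escaped shell form and the unescaped direct form should succeed, while the other two fail, matching the behavior described in the question.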