To escape characters in bash, Why the syntax is confusing when nesting commands deeply?, I know that there is an alternate approach with $() to nest commands, Just curious, why it is as such when nesting commands using backticks!
For example:
echo `echo \`echo \\\`echo inside\\\`\``
Gives output: inside
But
echo `echo \`echo \\`echo inside\\`\``
Fails with,
bash: command substitution: line 1: unexpected EOF while looking for matching ``'
bash: command substitution: line 2: syntax error: unexpected end of file
bash: command substitution: line 1: unexpected EOF while looking for matching ``'
bash: command substitution: line 2: syntax error: unexpected end of file
echo inside\
My question is that why the number of backslashes required for second level nesting is 3 and why it is not 2. In the above example given, one backslash is used for one level deep and three are used for second-level nesting commands to preserve the literal meaning of the backtick.
The basic problem is that there's no distinction between an open-backtick and a close-backtick. So if the shell sees something like this:
somecommand ` something1 ` something2 ` something3 `
...there's no intrinsic way to tell if that's two separate backticked commands (something1
and something3
), with a literal string ("something2") in between; or a nested backtick expression, with something2
being run first and its output passed to something1
as an argument (along with the literal string "something3"). In order to avoid ambiguity, the shell syntax picks the first interpretation, and requires that if you want the second interpretation you need to escape the inner level of backticks:
somecommand ` something1 ` something2 ` something3 ` # Two separate expansions
somecommand ` something1 \` something2 \` something3 ` # Nested expansions
And that means adding another level of parsing-and-removing escapes, which means you need to escape any escapes you didn't want parsed at that point, and the whole thing gets quickly out of hand.
The $( )
syntax, on the other hand, is not ambiguous, because the opening and closing markers are not the same. Compare the two possibilities:
somecommand $( something1 ) something2 $( something3 ) # Two separate expansions
somecommand $( something1 $( something2 ) something3 ) # Nested expansions
There's no ambiguity there, so no need for escapes or other syntactic weirdness.
The reason the number of escapes grows so fast with the number of levels is again to avoid ambiguity. And it's not something specific to command expansions with backticks; this escape inflation shows up anytime you have a string going through multiple levels of parsing, each of which applies (and removes) escapes.
Suppose the shell runs across two escapes and a backtick (\\`
) as it parses a line. Should it parse that as a doubly-escaped backtick, or a singly-escaped escape (backslash) character followed by a not-escaped-at-all backtick? If it runs across three escapes and a backtick (\\\`
), is that a triply-escaped backtick, a doubly-escaped escape followed a not-escaped-at-all backtick, or a singly-escaped escape followed by a singly-escaped backtick?
The shell (like most things that deal with escapes) avoids the ambiguity by not treating stacked escapes as a special thing. When it runs into an escape character, that applies only to the thing immediately after it; if the thing immediately after it is another escape, then it escapes that one character and has no effect on whatever's after it. Thus \\`
is an escaped escape, followed by a not-escaped-at-all backtick. That means you can't just add another escape to the front, you have to add an escape in front of each and every escape-worthy character in the string (including escapes from lower levels).
So, let's start with a simple backtick, and work through escaping it to various levels:
\'
.\\
) and then separately escape the backtick itself (\`
), giving a total of three backticks: \\\`
.\\
) and once again escape the backtick itself (\`
), giving a total of seven backticks: \\\\\\\`
.It continues like that, more than doubling the number of escapes for each level. From 7 it goes to 15, then 31, then 63, then... There's a good reason people try to avoid situations with deeply nested escapes.
Oh, and as I mentioned, the shell isn't the only thing that does this, and that can complicate matters because different levels can have different escaping syntaxes, and some things may not need escaping at some of the levels. For example, suppose the thing being escaped is the regular expression \s
. To add a level to that, you'd only need one additional escape (\\s
) because the "s" doesn't need to be escaped by itself. Additional levels of escaping on that would give 4, 8, 16, 32 etc escapes.
TLDR; Yo, dawg, I heard you like escapes...
P.s. You can use the shell's -v
option to make it print commands before executing them. With nested commands like this, it'll print each of the commands as it un-nests them, so you can watch the stack escaped escapes collapse as the layers get stripped off:
$ set -v
$ echo "this is `echo "a literal \`echo "backtick: \\\\\\\`" \`" `"
echo "this is `echo "a literal \`echo "backtick: \\\\\\\`" \`" `"
echo "a literal `echo "backtick: \\\`" `"
echo "backtick: \`"
this is a literal backtick: `
(For even more fun, try this after set -vx
-- the -x
option will print the commands after parsing, so after you see it drill into the nested commands, you'll then see what happens as it unwinds back out to the final top-level command.)