Escaping a variable with special characters within sed - comment and uncomment an arbitrary line of source code


I need to comment out a line in a crontab file from a script. The line contains directory paths, spaces and symbols. It is stored in a variable, and I am getting mixed up on how to escape that variable. Since the line changes on a regular basis, I don't want to hard-code any escaping in it. Simply prepending # is not enough either, because I also need to do the reverse: replace the commented line with the original, without the #.

So the goal is to replace $line with #$line (comment) with the possibility to do it the other way around (uncomment).

So I have a variable:

line="* * * hello/this/line & /still/this/line"

This line occurs in a file, file.txt, and needs to be commented out.

First try:

sed -i "s/^${line}/#${line}/" file.txt

Second try:

sed -i 's|'${line}'|'"#${line}"'|g' file.txt

Solution

  • choroba's helpful answer shows an effective solution using perl.


    sed solution

    If you want to use sed, you must use a separate sed command just to escape the $line variable value, because sed has no built-in way to escape strings for use as literals in a regex context:

    lineEscaped=$(sed 's/[^^]/[&]/g; s/\^/\\^/g' <<<"$line") # escape $line for use in regex
    sed -i "s/^$lineEscaped\$/#&/" file.txt # Note the \$ to escape the end-of-line anchor $
    

    With BSD/macOS sed, use -i '' instead of just -i for in-place updating without backup.
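If the same script has to run on both GNU and BSD/macOS systems, one option is to pick the form of -i at runtime. The following is only a sketch: the wrapper name sed_inplace and the sample file contents are hypothetical, and the detection assumes that GNU sed accepts --version while BSD/macOS sed rejects it.

```shell
# Choose the in-place flag form based on which sed is installed.
# sed_inplace is a hypothetical wrapper name, not part of sed itself.
if sed --version >/dev/null 2>&1; then
  sed_inplace() { sed -i "$@"; }      # GNU sed: -i takes no separate argument
else
  sed_inplace() { sed -i '' "$@"; }   # BSD/macOS sed: -i needs a (possibly empty) backup suffix
fi

# Made-up sample file for illustration.
demo=$(mktemp)
printf '%s\n' 'job one' 'job two' > "$demo"

# Comment out the exact line "job two" in place.
sed_inplace 's/^job two$/#&/' "$demo"
cat "$demo"
```
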

    And the reverse (un-commenting):

    sed -i "s/^#\($lineEscaped\)\$/\1/" file.txt
    

    See this answer of mine for an explanation of the sed command used for escaping, which should work with any input string.

    Also note how variable $lineEscaped is only referenced once, in the regex portion of the s command, whereas the substitution-string portion simply references what the regex matched (which avoids the need to escape the variable again, using different rules):
    & in the substitution string represents the entire match, and \1 the first capture group (parenthesized subexpression, \(...\)).
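To see the escape-then-match approach work end to end, here is a small round trip on a throwaway file. It is a sketch under a few assumptions: the temporary directory and the surrounding file contents are made up, printf with a pipe is used instead of <<< so that it also runs in plain sh, and output goes to a temporary file rather than using -i so that it does not depend on the sed flavor.

```shell
# Made-up sample data for illustration.
tmpdir=$(mktemp -d)
file="$tmpdir/file.txt"
line='* * * hello/this/line & /still/this/line'
printf '%s\n' 'first line' "$line" 'last line' > "$file"

# Escape $line: wrap every character except ^ in [...], and turn ^ into \^.
lineEscaped=$(printf '%s\n' "$line" | sed 's/[^^]/[&]/g; s/\^/\\^/g')

# Comment: & in the replacement stands for the whole matched line.
sed "s/^$lineEscaped\$/#&/" "$file" > "$file.new" && mv "$file.new" "$file"
commented=$(sed -n '2p' "$file")

# Uncomment: \1 is the captured original line without the leading #.
sed "s/^#\($lineEscaped\)\$/\1/" "$file" > "$file.new" && mv "$file.new" "$file"
restored=$(sed -n '2p' "$file")

printf '%s\n' "$commented" "$restored"
```

Printing $lineEscaped shows the line rewritten as a run of single-character bracket expressions, which is why characters such as *, / and & no longer need any special handling in the second command.
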

    For simplicity, the second sed command uses double quotes in order to embed the value of shell variable $lineEscaped in the sed script, but it is generally preferable to use single-quoted scripts so as to avoid confusion between what the shell interprets up front vs. what sed ends up seeing.

    For instance, $ is special to both the shell and sed, and in the above script the end-of-line anchor $ in the sed regex must therefore be escaped as \$ to prevent the shell from interpreting it.
    One way to avoid confusion is to selectively splice double-quoted shell-variable references into the otherwise single-quoted script:

    sed -i 's/^'"$lineEscaped"'$/#&/' file.txt
    

    awk solution

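    awk offers literal string matching, which obviates the need for escaping (with one caveat: -v interprets backslash escape sequences in the assigned value, so a line that contains backslashes needs them doubled first):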

    awk -v line="$line" '$0 == line { $0 = "#" $0 } 1' file.txt > $$.tmp && mv $$.tmp file.txt
    

    If you have GNU Awk v4.1+, you can use -i inplace for in-place updating.

    And the reverse (un-commenting):

    awk -v line="#$line" '$0 == line { $0 = substr($0, 2) } 1' file.txt > $$.tmp && 
      mv $$.tmp file.txt
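
Both awk directions can be folded into one helper. This is a minimal sketch, not a definitive implementation: the function name toggle_comment, its argument order, and the environment variable names TOGGLE_LINE and TOGGLE_MODE are all hypothetical. It passes the line through the environment and reads it via ENVIRON, which leaves backslashes untouched, and it appends "" to $0 to force a string comparison even if the line happens to look like a number. Writing the result back with cat rather than mv is a design choice that keeps the original file's permissions.

```shell
# toggle_comment FILE LINE MODE   (hypothetical helper; MODE is "comment" or "uncomment")
toggle_comment() {
  tmp=$(mktemp) || return 1
  TOGGLE_LINE="$2" TOGGLE_MODE="$3" awk '
    BEGIN { line = ENVIRON["TOGGLE_LINE"]; mode = ENVIRON["TOGGLE_MODE"] }
    # Exact, literal match of the whole line; no regex involved.
    mode == "comment"   && ($0 "") == line       { $0 = "#" $0 }
    mode == "uncomment" && ($0 "") == ("#" line) { $0 = substr($0, 2) }
    1
  ' "$1" > "$tmp" && cat "$tmp" > "$1"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage on a made-up sample file.
demo=$(mktemp)
line='* * * hello/this/line & /still/this/line'
printf '%s\n' 'before' "$line" 'after' > "$demo"

toggle_comment "$demo" "$line" comment
after_comment=$(sed -n '2p' "$demo")

toggle_comment "$demo" "$line" uncomment
after_uncomment=$(sed -n '2p' "$demo")

printf '%s\n' "$after_comment" "$after_uncomment"
```
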