Search code examples
pythonpython-re

Python Regex Expression Needed to Add Only a 2nd Backslash


I have an expression that works in sed and need to adopt it for python. I want to insert a backslash next to each "single" backslash. For clarity here, I am replacing an isolated backslash with an "X". Here is what works in sed. Remember that sed actually sees the input as aaa\bbb\ccc. I would like the output to be "aaa\bbb\ccc".

echo "aaa\\bbb\\\\ccc" | sed -e 's/\([^\\]\)\\\([^\\]\)/\1X\2/'

I tried a few things like:

re.sub("([^\\])\\([^\\])", "\1X\2", r"123\456\\789")
r"123\456\\789".replace(r"([^\])\([^\])", "\1X\2")

Solution

  • Use the r notation also for the other arguments that you pass to re.sub -- that way the string goes as-is (with all backslashes) to the regex engine (which uses backslash escaping also).

    So:

    s = re.sub(r"([^\\])\\([^\\])", r"\1X\2", r"123\456\\789")
    

    Or, with X replaced with two literal backslashes:

    s = re.sub(r"([^\\])\\([^\\])", r"\1\\\\\2", r"123\456\\789")
    

    There is a boundary case where this regular expression does not do the job right (if I understand your purpose correctly): when the isolated backslash is the very first or very last character in the string, it will not be doubled. If you need that to happen, use look-around:

    s = re.sub(r"(?<!\\)\\(?!\\)", r"\\\\", r"123\456\\789")