Why does Python 3's regular expression substitution swallow characters?

import re

base_path = "c:\\five"
print(base_path)
filename = "<data>\\a.txt"
filename = re.sub(r'(?i)<data>', base_path, filename) 
print(filename)

Output:

c:\five
c:
ive\a.txt

The expected output is c:\five\a.txt.

The same code doesn't do this in Python 2.

Changing it to something like the following results in the same thing.

reg = re.compile(re.escape('<data>'), re.IGNORECASE)
filename = reg.sub(base_path, filename)

Solution

The string literal "c:\\five" holds the text c:\five. When that text is used as the replacement string in re.sub, the backslash escapes in it are processed, so \f becomes a form-feed character and the output breaks after c:. It's a bit weird that it does this in the replacement string, but you can double the backslashes again (c:\\\\five in the literal) to work around it. Or you can pass the replacement as a function, which skips this escape processing:

base_path = "c:\\five"
filename = "<data>\\a.txt"
filename = re.sub(r'(?i)<data>', lambda _: base_path, filename) 
print(filename)

Output: c:\five\a.txt
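
The double-escaping workaround mentioned above can also be applied at runtime, which helps when the path comes from a variable rather than a literal. This is only a minimal sketch, assuming the replacement text should be inserted verbatim; the helper name literal_repl is made up for illustration and is not part of the re module:

```python
import re

def literal_repl(text):
    # Hypothetical helper: double every backslash so that re.sub's
    # escape processing turns each pair back into one literal backslash.
    return text.replace("\\", "\\\\")

base_path = "c:\\five"
filename = "<data>\\a.txt"
result = re.sub(r'(?i)<data>', literal_repl(base_path), filename)
print(result)  # c:\five\a.txt
```

The function form shown earlier is usually simpler; the escaping variant is mainly useful when an API only accepts a replacement string.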

See the docs for details:

repl can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, \n is converted to a single newline character, \r is converted to a carriage return, and so forth.
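
To see the documented behaviour in isolation, the following small sketch (the sample strings are arbitrary) compares a string replacement with a function replacement that returns the same text:

```python
import re

# A two-character string: a backslash followed by the letter n.
repl = "\\n"

# As a string replacement, the escape is processed into a real newline.
as_string = re.sub("X", repl, "aXb")

# As a function, the returned text is inserted unchanged.
as_function = re.sub("X", lambda m: repl, "aXb")

print(repr(as_string))    # 'a\nb'
print(repr(as_function))  # 'a\\nb'
```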