Tags: python, regex, backspace, control-characters

Apply formatting control characters (backspace and carriage return) to string, without needing recursion


What is the easiest way to "interpret" formatting control characters in a string, so that the result is what would be shown if the string were printed? For simplicity, I will assume there are no newlines in the string.

So for example,

>>> sys.stdout.write('foo\br')

shows for, therefore

interpret('foo\br') should be 'for'

>>> sys.stdout.write('foo\rbar')

shows bar, therefore

interpret('foo\rbar') should be 'bar'


I can write a regular expression substitution here, but, in the case of '\b' replacement, it would have to be applied recursively until there are no more occurrences. It would be quite complex if done without recursion.

Is there an easier way?


Solution

  • UPDATE: after 30 minutes of asking for clarification and an example string, it turns out the question is really a different one: "How to repeatedly apply formatting control characters (backspace) to a Python string?" In that case, yes, you need to apply the regex substitution repeatedly until it stops matching, because removing one character/backspace pair can expose another. SOLUTION:

    import re
    
    def repeated_re_sub(pattern, sub, s, flags=re.U):
        """Match-and-replace repeatedly until we run out of matches..."""
        patc = re.compile(pattern, flags)
    
        sold = ''
        while sold != s:
            sold = s
            # One pass removes every non-overlapping match; a removal can
            # bring a new <char><backspace> pair together, hence the loop.
            s = patc.sub(sub, sold)
    
        return s
    
    # '\b' in a non-raw string literal is the backspace character (\x08),
    # not the regex word-boundary escape.
    print(repeated_re_sub('[^\b]\b', '', 'abc\b\x08de\b\bfg'))  # prints afg
    #print(repeated_re_sub('.\b', '', 'abcd\b\x08e\b\bfg'))
    

    [multiple previous answers, asking for clarifications and pointing out that either re.sub(...) or string.replace(...) could be used to solve the problem, non-recursively.]
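
For comparison, here is a minimal sketch of a single-pass alternative that needs neither recursion nor repeated substitution and also covers the carriage-return case from the question. It is not part of the answer above. It assumes a simple terminal-like model: '\b' moves the cursor one column left without erasing, '\r' moves it to column 0, and any other character overwrites whatever is at the cursor. The name `interpret` is taken from the question; the overwrite model itself is an assumption, and real terminals may behave differently.

```python
def interpret(s):
    """Return the line as it would look after writing s (assumes no newlines)."""
    buf = []      # characters currently on the line
    cursor = 0    # column where the next character will be written
    for ch in s:
        if ch == '\b':
            cursor = max(cursor - 1, 0)   # move left, never past column 0
        elif ch == '\r':
            cursor = 0                    # return to the start of the line
        elif cursor < len(buf):
            buf[cursor] = ch              # overwrite the existing character
            cursor += 1
        else:
            buf.append(ch)                # extend the line
            cursor += 1
    return ''.join(buf)


print(interpret('foo\br'))    # prints for
print(interpret('foo\rbar'))  # prints bar
```

Under this model a backspace that is not followed by another character leaves the text unchanged (`interpret('foo\b')` gives `'foo'`), whereas the regex approach deletes the character and gives `'fo'`. Which behaviour is wanted depends on whether the goal is to mimic a terminal or to treat backspace as "delete previous character".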