python python-2.7 escaping string-formatting python-2.x

Escaping curly braces in string to be formatted an undefined number of times


You use curly braces ( {•••} ) to denote parts of strings to be formatted. If you want to use literal curly brace characters so that they're ignored by .format(), you use double curly braces ( {{•••}} ). MCVE:

string = "{format} {{This part won't be formatted. The final string will have literal curly braces here.}}"
print string.format(format='string')

If you have a chain of .format()s, you double the number of braces for each additional .format() call. Below, the last part passes through two calls, so it is surrounded by 4 braces and ends up with literal curly braces in the final output. MCVE:

string = "{format1} {{format2}} {{{{3rd one won't be formatted. The final string will have literal curly braces here.}}}}"
print string.format(format1='string1').format(format2='string2')
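
The doubling rule can be checked mechanically. The following is a minimal sketch, not part of any library: the helper name `escape_for_passes` is made up for illustration, and it simply doubles every brace once per planned .format() call. It is written to run on both Python 2.7 and Python 3.

```python
def escape_for_passes(text, passes):
    """Double every brace in `text` once per planned .format() call."""
    for _ in range(passes):
        text = text.replace("{", "{{").replace("}", "}}")
    return text


literal = "{stays literal}"
twice = escape_for_passes(literal, 2)  # four braces on each side

template = "{format1} {{format2}} " + twice
result = template.format(format1="string1").format(format2="string2")
print(result)
```

This only works when the number of passes is known in advance.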

It's also possible to format another format string into a format string. The last one is surrounded by 4 braces and ends up with literal curly braces in the final output. MCVE:

string = "{format1} {{{{3rd one won't be formatted. The final string will have literal curly braces here.}}}}"
print string.format(format1='{format2}').format(format2='string')

The problem arises when you use a chain of .format()s depending on conditions determined at runtime. If you want a set of the curly braces to be escaped as literal characters, how many do you use? MCVE:

string = "{} {{{{{{Here I don't know exactly how many curly braces to use because this string is formatted differently due to conditions that I have no control over.}}}}}}"

if fooCondition:
    string = string.format('{} bar')
    if barCondition:
        string = string.format('{} baz')
        if bazCondition:
            string = string.format('{} buzz')
string = string.format('foo')

print string

The 1st part of the string has 4 possible outputs:

  1. foo

  2. foo bar

  3. foo baz bar

  4. foo buzz baz bar
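
The braces in the 2nd part behave differently: every .format() call halves them. Here is a small sketch of that effect, where the names `apply_passes`, `template` and `leftover` are illustrative assumptions and the runtime conditions are replaced by an explicit pass count. It starts with 16 braces on each side, enough to survive four passes, and counts what is left after each possible number of calls.

```python
def apply_passes(template, passes):
    """Call .format() `passes` times with no arguments and return the result."""
    for _ in range(passes):
        template = template.format()
    return template


# 16 braces per side: 16 -> 8 -> 4 -> 2 -> 1 over four passes.
template = "{" * 16 + "x" + "}" * 16

leftover = {}
for passes in range(1, 5):
    out = apply_passes(template, passes)
    leftover[passes] = out.count("{")
    print("%d passes -> %d braces per side" % (passes, leftover[passes]))
```

A fifth pass would treat the remaining single pair as a replacement field and raise an error, so no fixed brace count is right for every path.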

The 2nd part of the string ends up with a different number of curly braces depending on how many conditions are True. I want the 2nd part's curly braces to stay permanently escaped, so that they don't "shed a layer" every time .format() is called. I can solve the problem like this, MCVE:

string = "{} {{DRY - Don't repeat yourself!}}"

if fooCondition:
    string = string.format('{} bar').replace("{DRY - Don't repeat yourself!}", "{{DRY - Don't repeat yourself!}}")
    if barCondition:
        string = string.format('{} baz').replace("{DRY - Don't repeat yourself!}", "{{DRY - Don't repeat yourself!}}")
        if bazCondition:
            string = string.format('{} buzz').replace("{DRY - Don't repeat yourself!}", "{{DRY - Don't repeat yourself!}}")
string = string.format('foo')

print string

But that's duplicate code (bad practice).

The MCVEs aren't my real code. My real code runs on a Google App Engine web server and is long and complex. It builds HTML, CSS, and JavaScript in strings, and I want to insert content into the HTML via .format() without disturbing the curly braces in the CSS and JS. My current implementation doesn't scale and is very error-prone: I have to manage up to 5 consecutive curly braces (like this: {{{{{•••}}}}} ) so that they pass through .format() chains untouched, and I periodically have to re-insert curly braces into strings that aren't formatted a fixed number of times. What's an elegant way to fix this spaghetti code?

Related: How to PERMANENTLY escape curly braces in Python format string?


Solution

  • I put together a partialformat function (in Python 3.x) that subclasses string.Formatter so that only the fields you supply values for are formatted; everything else is left in place. Edit: I've included a Python 2 version as well.

    ## python 3x version
    import string
    from _string import formatter_field_name_split
    ################################################################################
    def partialformat(s: str, recursionlimit: int = 10, **kwargs):
        """
        vformat does the actual work of formatting strings. _vformat is the
        internal method that vformat calls, and it accepts a recursion depth
        that controls how many levels of nested curly braces are handled.
        vformat itself does not expose that argument and hard-codes it to 2.

        The 2nd argument of _vformat, 'args', lets us pass in the string '{}'
        so that an empty curly brace pair is substituted with itself and
        thereby left alone.
        """
    
        class FormatPlaceholder(object):
            def __init__(self, key):
                self.key = key
    
            def __format__(self, spec):
                result = self.key
                if spec:
                    result += ":" + spec
                return "{" + result + "}"
    
            def __getitem__(self, item):
                return
    
        class FormatDict(dict):
            def __missing__(self, key):
                return FormatPlaceholder(key)
    
        class PartialFormatter(string.Formatter):
            def get_field(self, field_name, args, kwargs):
                try:
                    obj, first = super(PartialFormatter, self).get_field(field_name, args, kwargs)
                except (IndexError, KeyError, AttributeError):
                    first, rest = formatter_field_name_split(field_name)
                    obj = '{' + field_name + '}'
    
                    # loop through the rest of the field_name, doing
                    #  getattr or getitem as needed
                    for is_attr, i in rest:
                        if is_attr:
                            try:
                                obj = getattr(obj, i)
                            except AttributeError as exc:
                                pass
                        else:
                            obj = obj[i]
    
                return obj, first
    
        fmttr = PartialFormatter()
        # Recent Python 3 versions return a (result, auto_arg_index) tuple
        # from _vformat; only the formatted string is needed here.
        fs, _ = fmttr._vformat(s, ("{}",), FormatDict(**kwargs), set(), recursionlimit)
        return fs
    

    edit: looks like python 2.x has some minor differences.

    ## python 2.x version
    import string
    formatter_field_name_split = str._formatter_field_name_split
    def partialformat(s, recursionlimit = 10, **kwargs):
        """
        vformat does the actual work of formatting strings. _vformat is the
        internal method that vformat calls, and it accepts a recursion depth
        that controls how many levels of nested curly braces are handled.
        vformat itself does not expose that argument and hard-codes it to 2.

        The 2nd argument of _vformat, 'args', lets us pass in the string '{}'
        so that an empty curly brace pair is substituted with itself and
        thereby left alone.
        """
    
        class FormatPlaceholder(object):
            def __init__(self, key):
                self.key = key
    
            def __format__(self, spec):
                result = self.key
                if spec:
                    result += ":" + spec
                return "{" + result + "}"
    
            def __getitem__(self, item):
                return
    
        class FormatDict(dict):
            def __missing__(self, key):
                return FormatPlaceholder(key)
    
        class PartialFormatter(string.Formatter):
            def get_field(self, field_name, args, kwargs):
                try:
                    obj, first = super(PartialFormatter, self).get_field(field_name, args, kwargs)
                except (IndexError, KeyError, AttributeError):
                    first, rest = formatter_field_name_split(field_name)
                    obj = '{' + field_name + '}'
    
                    # loop through the rest of the field_name, doing
                    #  getattr or getitem as needed
                    for is_attr, i in rest:
                        if is_attr:
                            try:
                                obj = getattr(obj, i)
                            except AttributeError as exc:
                                pass
                        else:
                            obj = obj[i]
    
                return obj, first
    
        fmttr = PartialFormatter()
        # In Python 2, _vformat returns the formatted string directly.
        fs = fmttr._vformat(s, ("{}",), FormatDict(**kwargs), set(), recursionlimit)
        return fs
    

    Usage:

    class ColorObj(object):
        blue = "^BLUE^"
    s = '{"a": {"b": {"c": {"d" : {} {foo:<12} & {foo!r} {arg} {color.blue:<10} {color.pink} {blah.atr} }}}}'
    print(partialformat(s, foo="Fooolery", arg="ARRrrrrrg!", color=ColorObj))
    

    Output:

    {"a": {"b": {"c": {"d" : {} Fooolery             & 'Fooolery' Fooolery ARRrrrrrg! ^BLUE^ {color.pink} {blah.atr} }}}}
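
A different technique from partialformat, offered only as a sketch: if the literal braces are known up front, they can be swapped for sentinel characters before the .format() chain and swapped back once at the end, so plain str.format never sees them. The sentinels `\x00` and `\x01` and the helpers `protect`, `restore` and `build` are assumptions made for illustration; the sentinels must not occur in the real content. The code runs on both Python 2.7 and Python 3.

```python
OPEN, CLOSE = "\x00", "\x01"  # hypothetical sentinels; must not appear in real content


def protect(text):
    """Hide literal braces from str.format by replacing them with sentinels."""
    return text.replace("{", OPEN).replace("}", CLOSE)


def restore(text):
    """Turn the sentinels back into literal braces after the last .format()."""
    return text.replace(OPEN, "{").replace(CLOSE, "}")


def build(foo_condition, bar_condition, baz_condition):
    """Mirror of the conditional .format() chain from the question."""
    s = "{} " + protect("{DRY - Don't repeat yourself!}")
    if foo_condition:
        s = s.format("{} bar")
        if bar_condition:
            s = s.format("{} baz")
            if baz_condition:
                s = s.format("{} buzz")
    return restore(s.format("foo"))


print(build(False, False, False))
print(build(True, True, True))
```

The protected part comes out with exactly one pair of braces however many conditions are True, because the braces are restored once rather than re-escaped after every call.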