Tags: python, pandas, multiple-columns, prefix

add_prefix fails with a percent sign


import pandas as pd

df = pd.DataFrame({'a': [1, 4], 'b': [7, 8]})
print(df)
   a  b
0  1  7
1  4  8

I am trying to add % to the column names, so I use DataFrame.add_prefix:

print(df.add_prefix('%'))

TypeError: not all arguments converted during string formatting

What could be the problem?

How can add_prefix be used with %?

Or is this simply a bug?

Note:

A possible workaround is:

df.columns = ['%' + col for col in df.columns]
print (df)
   %a  %b
0   1   7
1   4   8

but I am interested in the add_prefix function itself.


Solution

  • NOTE:
    This is likely to change in the near future, as pandas will use new-style string formatting in this case. When that happens:

    '{}hello'.format('%')
    
    '%hello'
    

    adding a prefix with a single % will work just fine.

    See GitHub
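
As a rough illustration of the point in the note, new-style formatting gives % no special meaning, so a single percent sign needs no escaping. This is a plain-Python sketch, not pandas' actual implementation; `prefix`, `columns`, and `renamed` are names made up for the example.

```python
# New-style formatting: '%' is an ordinary character inside str.format,
# so a single percent sign can be used as a prefix directly.
prefix = '%'
columns = ['a', 'b']  # illustrative stand-in for df.columns

renamed = ['{}{}'.format(prefix, col) for col in columns]
print(renamed)  # ['%a', '%b']
```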


    Answer
    Two percent signs! One escapes the other when using the old style of string formatting.

    df.add_prefix('%%')
    
       %a  %b
    0   1   7
    1   4   8
    

    The clue came from the traceback:

       2856     def add_prefix(self, prefix):
       2857         f = (str(prefix) + '%s').__mod__
    -> 2858         return self.rename_axis(f, axis=0)
       2859 
       2860     def add_suffix(self, suffix):
    

    So I tried it myself

    f = ('my_prefix_' + '%s').__mod__
    f('hello')
    
    'my_prefix_hello'
    

    And then

    f = ('%' + '%s').__mod__
    f('hello')
    
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-901-0c92e5498bbc> in <module>()
          1 f = ('%' + '%s').__mod__
    ----> 2 f('hello')
    
    TypeError: not all arguments converted during string formatting
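
One way to see why this fails: concatenating '%' and '%s' gives the template '%%s', which old-style formatting reads as an escaped literal percent sign followed by a plain 's'. No conversion specifier is left to consume the argument, hence the error. A small sketch in plain Python (the variable names are only illustrative):

```python
# '%' + '%s' is the template '%%s': an escaped '%' followed by a literal 's'.
template = '%' + '%s'

# With no arguments, the template formats cleanly to the literal text '%s'.
no_args = template % ()

# With one argument, nothing in the template consumes it, so Python raises.
try:
    template % 'hello'
    message = None
except TypeError as exc:
    message = str(exc)

print(repr(no_args))  # '%s'
print(message)        # not all arguments converted during string formatting
```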
    

    So I looked up how to escape the '%' in the old style of string formatting and found this answer.

    Which led to this:

    f = ('%%' + '%s').__mod__
    f('hello')
    
    '%hello'
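
The same escaping can be wrapped up so that the caller does not have to remember to double the percent sign. The sketch below assumes the old-style `(prefix + '%s').__mod__` construction shown in the traceback; `make_prefixer` is a hypothetical helper written for illustration, not a pandas function.

```python
def make_prefixer(prefix):
    """Return a function that prepends `prefix` using old-style % formatting.

    Hypothetical helper: every '%' in the prefix is doubled first, so it is
    treated as a literal character rather than the start of a specifier.
    """
    escaped = str(prefix).replace('%', '%%')
    return (escaped + '%s').__mod__


f = make_prefixer('%')
result = f('hello')
print(result)  # %hello
```

Prefixes without a percent sign pass through unchanged, so `make_prefixer('my_prefix_')('hello')` still gives `'my_prefix_hello'`.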