
Python repr string w/ real newlines


I want to use repr() to get a Python-encoded string literal (that I can paste into some source code), but I'd prefer a triple-quoted string with real newlines rather than the \n escape sequence.

I could post-process the string to convert \n back into a newline character and add a couple more quotes, but if the source contains a literal \\n (an escaped backslash followed by n), I wouldn't want to match that.

What's the easiest way to do this?


Example input:

foo💩
bar

Or as a Python string:

'foo💩\nbar'

Desired output:

'''foo\xf0\x9f\x92\xa9
bar'''

Triple-single or triple-double quotes are fine, but I do want it broken across multiple lines like that.


What I have so far:

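#!/usr/bin/env python
import sys
import re

# Rewrite the file named on the command line, in place, as a string literal.
with open(sys.argv[1], 'r+') as f:
    data = f.read()
    f.seek(0)
    # repr() supplies one quote per side (assuming it picks single quotes);
    # "''" adds the other two. The regex turns every \n in the repr into a
    # real newline, including the one inside \\n, which is the unwanted match.
    out = "''" + re.sub(r"\\n", '\n', repr(data)) + "''"
    f.write(out)
    f.truncate()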

I'm still trying to figure out the regex to avoid converting escaped \ns.

The goal is that if I paste that back into a Python source file I will get back out exactly the same thing as I read in.


I'm using Python 2.7.14


Solution

  • How about splitting it with splitlines() and encoding each line separately:

    s = 'foo💩\nbar'
    
    r = "'''" + '\n'.join(repr(x)[1:-1] for x in s.splitlines()) + "'''"
    
    assert eval(r) == s
    

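    If you're on Python 2 and the inputs are unicode, use repr(x)[2:-1] to strip the leading u as well, and put the u back in front of the opening quotes (u'''...) so the literal still evaluates to unicode. The same applies to Python 3 and bytes inputs, with a b prefix.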
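
If the round trip has to be exact, a few edge cases are worth covering: a trailing newline, a \r in the data, and input for which repr() chooses double quotes (a string containing ' but no "). The following is only a sketch, not part of the answer above. The name triple_quoted_repr is hypothetical, and the code assumes the input is a str, unicode, or bytes object whose repr() has the usual prefix-quote-body-quote shape. Rather than splitting the input, it walks the escape sequences in the repr() output pairwise, so the backslash pair in \\n is consumed before the n is looked at. It should behave the same on Python 2.7 and Python 3.

```python
import re


def triple_quoted_repr(s):
    """Return a '''-quoted literal for s that uses real newlines.

    Hypothetical helper: assumes s is a string type whose repr() looks
    like an optional prefix (u or b), a quote, the body, and the same quote.
    """
    match = re.match(r"([A-Za-z]*)(['\"])(.*)\2$", repr(s), re.S)
    prefix, quote, inner = match.groups()

    # When repr() chose double quotes, single quotes in the body are
    # unescaped and could end the '''-quoted literal early, so escape them.
    if quote == '"':
        inner = inner.replace("'", "\\'")

    def expand(m):
        # Each match is one complete escape sequence start (backslash plus
        # one character); only a true \n escape becomes a real newline.
        return '\n' if m.group(1) == 'n' else m.group(0)

    # Matches are found left to right without overlapping, so in \\n the
    # pair \\ is matched first and the following n is left untouched.
    inner = re.sub(r'\\(.)', expand, inner)
    return prefix + "'''" + inner + "'''"


print(triple_quoted_repr('foo\nbar'))
```

Other escapes such as \r, \t, and \xNN are left exactly as repr() wrote them, so only \n turns into a line break in the pasted source.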