I want to use repr() to get a Python string literal (one that I can paste into some source code), but I'd prefer a triple-quoted string with real newlines rather than the \n escape sequence.
I could post-process the string to convert \n back into a newline character and add a couple more quotes, but if \\n (an escaped backslash followed by n) is in the source, I wouldn't want to match on that.
What's the easiest way to do this?
Example input:
foo💩
bar
Or as a Python string:
'foo💩\nbar'
Desired output:
'''foo\xf0\x9f\x92\xa9
bar'''
Either triple-single or triple-double quotes are fine, but I do want the output broken across multiple lines like that.
What I have so far:
#!/usr/bin/env python
import sys
import re
with open(sys.argv[1], 'r+') as f:
    data = f.read()
    f.seek(0)
    out = "''" + re.sub(r"\\n", '\n', repr(data)) + "''"
    f.write(out)
    f.truncate()
I'm still trying to figure out a regex that avoids converting escaped \n sequences (that is, \\n).
The goal is that if I paste that back into a Python source file I will get back out exactly the same thing as I read in.
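For illustration, one possible way to avoid touching \\n is to stop matching \n directly and instead walk every escape sequence from left to right, so that a doubled backslash is consumed as a unit before the n is ever looked at. This is only a rough sketch: the helper name newlines_to_real is made up for this example, and it assumes its input is the inside of a repr() result (quotes already stripped), so it contains no real newlines yet.

```python
import re

def newlines_to_real(body):
    # Hypothetical helper: `body` is repr(text) with the surrounding quotes removed.
    # Each match consumes a backslash plus the character it escapes, so the pair
    # `\\` is swallowed whole and the `n` after it is never treated as `\n`.
    return re.sub(
        r'\\(.)',
        lambda m: '\n' if m.group(1) == 'n' else m.group(0),
        body,
    )

original = 'a\nb\\nc'            # a real newline, then a literal backslash + n
body = repr(original)[1:-1]      # assumes repr() used plain single quotes here
literal = "'''" + newlines_to_real(body) + "'''"
```

Evaluating literal should give back original, with only the real newline written out as a line break.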
I'm using Python 2.7.14.
How about using splitlines() and encoding each line separately:
s = 'foo💩\nbar'
r = "'''" + '\n'.join(repr(x)[1:-1] for x in s.splitlines()) + "'''"
assert eval(r) == s
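A couple of edge cases may be worth guarding against with this approach: splitlines() drops a trailing newline and also splits on \r, and repr() switches to double quotes for a line that contains ' but no ", which can leave a bare quote right next to the closing '''. The following is a sketch of one way to handle both, not a definitive implementation. The function name triple_quote_lines is invented for this example, and it assumes a plain string whose repr() has no u or b prefix.

```python
def triple_quote_lines(s):
    # Hypothetical helper: build a '''...''' literal with one source line per line of s.
    parts = []
    # Split on '\n' only, so '\r' and a trailing newline survive the round trip.
    for line in s.split('\n'):
        lit = repr(line)
        body = lit[1:-1]  # assumes no u/b prefix on the repr
        if lit[0] == '"':
            # repr() chose double quotes because the line has ' but no ";
            # escape the bare single quotes so they are safe inside '''...'''.
            body = body.replace("'", "\\'")
        parts.append(body)
    return "'''" + '\n'.join(parts) + "'''"

samples = ['foo\nbar', 'foo\n', "it's", 'say "hi"', 'a\\nb', 'a\r\nb', "ends'", "'''", '']
round_trips = [eval(triple_quote_lines(x)) == x for x in samples]
```

Every double quote is left as-is and every single quote ends up escaped, so the body can never close the triple-single-quoted literal early.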
If you're on Python 2 and the inputs are unicode, then use repr(x)[2:-1] to strip the leading u as well. The same applies to Python 3 with bytes inputs, where the prefix is b.
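Rather than hard-coding [1:-1] versus [2:-1], the prefix could be detected by locating the opening quote. A minimal sketch, assuming the value is a str, unicode, or bytes object whose repr() ends with its quote character; the name literal_parts is hypothetical.

```python
def literal_parts(value):
    # Hypothetical helper: split repr(value) into (prefix, quote, body).
    lit = repr(value)
    quote = lit[-1]            # the literal always ends with its quote character
    start = lit.index(quote)   # first occurrence is the opening quote
    return lit[:start], quote, lit[start + 1:-1]

prefix, quote, body = literal_parts('foo\nbar')
```

The prefix comes back as an empty string, 'u', or 'b' depending on the type and Python version, so it can be put back in front of the triple quotes when the literal is reassembled.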