In Python 2, I would like to evaluate a string that contains the representation of a literal. I want to do this safely, so I don't want to use eval(); instead I've become accustomed to using ast.literal_eval() for this kind of task. However, I also want to evaluate under the assumption that string literals in plain quotes denote unicode objects, i.e. the forward-compatible behavior you get with from __future__ import unicode_literals. In the example below, eval() respects this preference, but ast.literal_eval() does not.
from __future__ import unicode_literals, print_function
import ast
raw = r""" 'hello' """
value = eval(raw.strip())
print(repr(value))
# Prints:
# u'hello'
value = ast.literal_eval(raw.strip())
print(repr(value))
# Prints:
# 'hello'
Note that I'm looking for a general-purpose literal_eval replacement; I don't know in advance that the output is necessarily a string object. I want to be able to assume only that raw is the representation of an arbitrary Python literal, which may be a string, may contain one or more strings, or may contain no strings at all.
Is there a way to get the best of both worlds: a function that both securely evaluates representations of arbitrary Python literals and respects the unicode_literals preference?
Neither ast.literal_eval nor ast.parse offers an option to set compiler flags. However, you can pass the appropriate flags to compile to parse the string with unicode_literals activated, then run ast.literal_eval on the resulting AST node:
import ast
# Not a future statement. This imports the __future__ module, and has no special
# effects beyond that.
import __future__
unparsed = '"blah"'
parsed = compile(unparsed,
                 '<string>',
                 'eval',
                 ast.PyCF_ONLY_AST | __future__.unicode_literals.compiler_flag)
value = ast.literal_eval(parsed)
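For reuse, the two steps can be wrapped in a small helper. This is a sketch; the function name literal_eval_u is hypothetical, not part of the standard library. Note that ast.literal_eval accepts an AST node as well as a string, which is what makes this composition possible:

```python
import ast
# Not a future statement; this imports the __future__ module itself.
import __future__


def literal_eval_u(source):
    """Safely evaluate a literal, treating plain string literals as unicode.

    Parses the source with the unicode_literals compiler flag set, then
    hands the resulting AST node to ast.literal_eval, so no code is ever
    executed. (Hypothetical helper name, not a stdlib function.)
    """
    node = compile(source, '<string>', 'eval',
                   ast.PyCF_ONLY_AST |
                   __future__.unicode_literals.compiler_flag)
    return ast.literal_eval(node)


# On Python 2 this yields u'hello'; on Python 3 the flag is a no-op,
# since all plain string literals are already unicode there.
print(repr(literal_eval_u("'hello'")))
```

Because the flag only affects parsing, the helper still handles arbitrary literals: lists, tuples, dicts, numbers, and None all evaluate exactly as ast.literal_eval would evaluate them, with any embedded string literals promoted to unicode on Python 2.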