There are many questions on SO about using Python's eval on insecure strings (eg.: Security of Python's eval() on untrusted strings?, Python: make eval safe). The unanimous answer is that this is a bad idea.
However, I found little information on which strings can be considered safe (if any). Now I'm wondering if there is a definition of "safe strings" available (eg.: a string that only contains lower case ascii chars or any of the signs +-*/()). The exploits I found generally relied on either of _.,:[]'" or the like. Can such an approach be secure (for use in a graph painting web application)?
Otherwise, I guess using a parsing package as Alex Martelli suggested is the only way.
EDIT: Unfortunately, there are neither answers that give a compelling explanation for why/ how the above strings are to be considered insecure (a tiny working exploit) nor explanations for the contrary. I am aware that using eval should be avoided, but that's not the question. Hence, I'll award a bounty to the first who comes up with either a working exploit or a really good explanation why a string mangled as described above is to be considered (in)secure.
Here you have a working "exploit" with your restrictions in place - only contains lower case ascii chars or any of the signs +-*/() . It relies on a 2nd eval layer.
def mask_code( python_code ):
s="+".join(["chr("+str(ord(i))+")" for i in python_code])
return "eval("+s+")"
bad_code='''__import__("os").getcwd()'''
masked= mask_code( bad_code )
print masked
print eval(bad_code)
output:
eval(chr(111)+chr(115)+chr(46)+chr(103)+chr(101)+chr(116)+chr(99)+chr(119)+chr(100)+chr(40)+chr(41))
/home/user
This is a very trivial "exploit". I'm sure there's countless others, even with further character restrictions. It bears repeating that one should always use a parser or ast.literal_eval(). Only by parsing the tokens can one be sure the string is safe to evaluate. Anything else is betting against the house.