I am writing a Python tool that performs static analysis of Python code and also rewrites some of that code automatically (for example, expanding wildcard imports). As part of this, the tool needs to extract every expression that occurs inside an f-string, ideally formatted exactly as it appears in the source.
f'{a+b} {c *d}' -> ['a+b', 'c *d']
f'{int("3" * 2):d}' -> ['int("3" * 2)']
The best I could come up with was a solution that returns the expressions, but without preserving the original formatting:
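import ast  # ast.unparse() requires Python 3.9+

def extract(s):
    # Wrap the text in an f-string literal and parse it as an expression
    values = ast.parse(f"f'{s}'").body[0].value.values
    # Keep only the {...} replacement fields, dropping literal text
    fvalues = [v for v in values if isinstance(v, ast.FormattedValue)]
    # Unparsing regenerates the code, so the original spacing and quotes are lost
    return [ast.unparse(t.value) for t in fvalues]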
>>> extract('{a+b} {c *d}')
['a + b', 'c * d']
>>> extract('{int("3" * 2):d}')
["int('3' * 2)"]
For my tool, this output is already somewhat useful, but I would much prefer to retain the original formatting. Is there a clean solution for this, aside from replicating the intricate f-string parsing logic in the implementation of the Python interpreter? (Where exactly can I find that logic, anyway? Python-ast.h doesn't seem to house it.)
The redbaron module promises to parse Python code into a syntax tree that retains formatting, but it does not appear to support f-strings.
The documentation for the ast module has a "See also" box at the bottom with some options that should help you, including:
(Some of the others might help too, but I couldn't tell from a brief look.)
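Separately from the tools in that box, here is a stdlib-only sketch that may be enough if you can run on Python 3.12 or later. It is a different technique from the ones listed there: it relies on `ast.get_source_segment()` together with the accurate source positions that f-string sub-expressions get once f-strings are parsed by the regular parser (PEP 701). On older versions those positions are unreliable, so the sketch refuses to run there. The function name `extract_fstring_expressions` is made up for illustration, and the function takes real source code (including the `f'...'` literal) rather than the bare string body used in the question.

```python
import ast
import sys


def extract_fstring_expressions(source):
    """Return the original text of each expression embedded in f-strings.

    Assumption: Python 3.12+, where nodes inside f-strings carry accurate
    line and column offsets.
    """
    if sys.version_info < (3, 12):
        raise RuntimeError("accurate f-string positions need Python 3.12+")
    tree = ast.parse(source)
    return [
        # Slice the exact source text covered by the expression node
        ast.get_source_segment(source, node.value)
        for node in ast.walk(tree)
        if isinstance(node, ast.FormattedValue)
    ]
```

On 3.12+, calling it with `"f'{a+b} {c *d}'"` should give `['a+b', 'c *d']`, with the original spacing intact. Note that `ast.walk()` also visits replacement fields nested inside format specs (as in `f'{x:{width}}'`), so filter those out if you only want the top-level expressions.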