Tags: python, python-3.x, abstract-syntax-tree

How can I parse ast.Assign objects?


I'm trying to extract information about the top-level assignments in my Python code with the ast module, but I'm not too familiar with how to parse the resulting tree effectively.

For example

import some_module

a = 1
b = some_module.load_number("some_path")

def plus(a,b):
   return a + b

c = plus(a,b)

I would like to be able to extract from this, a list of assignment expressions with data like the following:

{
'targets' : 'a',
'type' : 'static_value',
'value' : 1
},

{
'targets' : 'b',
'type' : 'function',
'value' : 'some_module.load_number',
'inputs' : ['some_path']
},

{
'targets' : 'c',
'type' : 'function',
'value' : 'plus',
'inputs' : ['var.a', 'var.b']
},

I care only about assignments. My current approach would work something like this.

import ast

with open("test_code.py", "rb") as f:
   content = f.read()
code = ast.parse(content)
results = []

def parse_assignment(node):
   # Extract the names of the variables being assigned to
   targets = [x.id for x in node.targets]

   # Extract assigned values for primitives; only str shown for brevity
   # (ast.Str is deprecated since Python 3.8; string literals are ast.Constant)
   if isinstance(node.value, ast.Constant) and isinstance(node.value.value, str):
      values = [node.value.value]
      types = ['str']

   # Function call version
   elif isinstance(node.value, ast.Call):
      values = [node.value.func.value.id]
      types = ['function']

   return {'targets' : targets, 'values' : values, 'types' : types}

for node in ast.walk(code):
   if isinstance(node, ast.Assign):
      print(node)
      results.append(parse_assignment(node))

I have two problems with the approach shown here:

  1. I do not think ast.walk is a good idea here as it seems to be recursive and may pick up assignments within a function definition or something. I want only top level assignments.

  2. The way I'm parsing the function names seems to be incorrect. In this example I want to parse out some_module.load_number but I instead get some_module. How can I get the full function name from an ast.Call object?


Solution

  • Instead of ast.walk, just iterate over code.body and pick out the Assign objects.
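
To illustrate the difference, here is a minimal sketch that compares the two traversals. The sample source mirrors the question's example, with one extra nested assignment (`total`) added purely for demonstration; the variable names are assumptions, not part of the original code.

```python
import ast

# Sample source mirroring the question's example, plus one nested assignment.
SOURCE = '''
import some_module

a = 1
b = some_module.load_number("some_path")

def plus(a, b):
    total = a + b   # nested assignment, not top level
    return total

c = plus(a, b)
'''

code = ast.parse(SOURCE)

# code.body holds only the module's direct child statements, so nothing
# inside the function body is visited.
top_level = [node for node in code.body if isinstance(node, ast.Assign)]
top_level_names = [t.id for node in top_level
                   for t in node.targets if isinstance(t, ast.Name)]

# ast.walk descends into every node, so it also finds `total`.
walked_names = [t.id for node in ast.walk(code) if isinstance(node, ast.Assign)
                for t in node.targets if isinstance(t, ast.Name)]

print(top_level_names)       # ['a', 'b', 'c']
print(sorted(walked_names))  # ['a', 'b', 'c', 'total']
```

One caveat: an assignment nested inside a module-level `if`, `try`, or `with` block is not a direct child of the module either, so this approach skips it as well.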

    As for the function name: the Call node has a func attribute, which in this case is an Attribute node. That node's value is a Name node whose id is the module name, and its attr is the name of the attribute that actually refers to the function.

    >>> call = code.body[2].value.func
    >>> f'{call.value.id}.{call.attr}'
    'some_module.load_number'
    

    Note that in general, the func attribute of a Call node could be an arbitrary expression whose run-time value is a callable.
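
Because of that, a more general helper might walk the chain of Attribute nodes and fall back to the source text when the callee is not a plain dotted name. The following is only a sketch under some assumptions: the helper names (`callee_name`, `describe_argument`, `describe_assignment`) and the `'other'` type label are made up for illustration, the `var.` prefix follows the output format requested in the question, keyword arguments are ignored, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

def callee_name(func):
    """Return a dotted name for the callee of an ast.Call, if it has one."""
    parts = []
    node = func
    # Peel off attribute accesses from the outside in: a.b.c -> c, b, a
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    # Not a plain dotted name (e.g. the callee is itself a call or a
    # subscript): fall back to the source text of the expression.
    return ast.unparse(func)

def describe_argument(arg):
    """Describe one positional argument in the question's format."""
    if isinstance(arg, ast.Constant):
        return arg.value
    if isinstance(arg, ast.Name):
        return f"var.{arg.id}"
    return ast.unparse(arg)

def describe_assignment(node):
    """Build a record for one ast.Assign node."""
    targets = [ast.unparse(t) for t in node.targets]
    value = node.value
    if isinstance(value, ast.Constant):
        return {"targets": targets, "type": "static_value", "value": value.value}
    if isinstance(value, ast.Call):
        return {"targets": targets, "type": "function",
                "value": callee_name(value.func),
                "inputs": [describe_argument(a) for a in value.args]}
    # Hypothetical catch-all for anything else (e.g. a + b)
    return {"targets": targets, "type": "other", "value": ast.unparse(value)}

SOURCE = '''
import some_module

a = 1
b = some_module.load_number("some_path")

def plus(a, b):
    return a + b

c = plus(a, b)
'''

results = [describe_assignment(n) for n in ast.parse(SOURCE).body
           if isinstance(n, ast.Assign)]
for record in results:
    print(record)
```

Unlike the desired output in the question, `targets` is a list here, since a single statement such as `x = y = 1` can assign to several targets.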