I want to write a Python program that analyzes the execution of other arbitrary Python programs.
For example, suppose I have a Python script called main.py that calls a function func a certain number of times. I want to create another script called analyzer.py that can "look inside" main.py while it's running and record how many times func was called. I also want to record the list of input arguments passed to func, and the return value of func each time it was called.
I cannot modify the source code of main.py or func in any way. Ideally analyzer.py would work for any Python program, and for any function.
The best way I have found to accomplish this is to have analyzer.py run main.py as a subprocess using pdb.
import subprocess

script = "main.py"
process = subprocess.Popen(
    ["python", "-m", "pdb", script],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
I can then send pdb commands to the program via the process's stdin and read the output via stdout.
To retrieve the input parameters and return values of func, I need to locate func by analyzing its file, print locals() to stdout (to get the input parameters), and print __return__ to stdout (to get the return value).
I'm wondering if there is a better way to accomplish this.
Instead of controlling pdb with pipes, you can just configure your own trace function using sys.settrace before doing import main. (Of course you can also do importlib.import_module("main") or runpy.run_module() or runpy.run_path().)
For instance,
import sys

def trace(frame, event, args):
    # A "call" event fires whenever a new Python function frame is entered;
    # at that point f_locals holds only the arguments.
    if event == "call":
        print(frame.f_code.co_name, frame.f_locals)

sys.settrace(trace)

# (this is where you'd `import main` to cede control to it)

def func(a, b, c):
    return a + b + c

func(1, 2, 3)
func("a", "b", "c")
prints out
func {'a': 1, 'b': 2, 'c': 3}
func {'a': 'a', 'b': 'b', 'c': 'c'}
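The snippet above only reports arguments. To also get return values and a call count, one option is to return a local trace function from the "call" event, since the interpreter then delivers a "return" event for that frame with the return value in the third parameter. The following is a minimal sketch of that idea, not a finished analyzer: the helper name analyze, the record layout, and the MAIN_SOURCE stand-in for main.py are all made up for illustration, and functions are matched by name only.

```python
import os
import runpy
import sys
import tempfile

# Hypothetical stand-in for main.py so that the sketch is self-contained.
MAIN_SOURCE = '''
def func(a, b, c):
    return a + b + c

func(1, 2, 3)
func("a", "b", "c")
'''


def analyze(script_path, target_name):
    """Run script_path and record arguments and return value of each call
    to any function named target_name (matched by name only, so same-named
    functions in other modules are recorded too)."""
    calls = []

    def global_trace(frame, event, arg):
        # Called once for every new frame; ignore everything but the target.
        if event != "call" or frame.f_code.co_name != target_name:
            return None  # no local tracing for this frame
        # At call time f_locals contains just the bound arguments.
        record = {"args": dict(frame.f_locals), "return": None}
        calls.append(record)

        def local_trace(frame, event, arg):
            # For a "return" event, arg is the value being returned.
            # It is also None when the frame exits via an exception.
            if event == "return":
                record["return"] = arg
            return local_trace

        return local_trace

    previous = sys.gettrace()  # keep any tracer that is already installed
    sys.settrace(global_trace)
    try:
        # Run the script as if it were started with `python script_path`.
        runpy.run_path(script_path, run_name="__main__")
    finally:
        sys.settrace(previous)
    return calls


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "main.py")
        with open(path, "w") as fh:
            fh.write(MAIN_SOURCE)
        calls = analyze(path, "func")
    print(len(calls))
    for call in calls:
        print(call["args"], "->", call["return"])
```

Run as a script, this should print 2 followed by one line per call showing the arguments and the result (6 and abc). The trace function is per-thread, so calls made from threads started by the target program would need threading.settrace as well.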