Tags: python, compiler-construction, introspection

Embed Python in Python?


I wrote a "compiler" PypTeX that converts an input file a.tex containing Hello @{3+4} to an ouput file a.pyptex containing Hello 7. I evaluate arbitrary Python fragments like @{3+4} using something like eval(compile('3+4','a.tex',mode='eval'),myglobals), where myglobals is some (initially empty) dict. This creates a thin illusion of an embedded interpreter for running code in a.tex, however the call stack when running '3+4' looks pretty weird, because it backs up all the way into the PypTeX interpreter, instead of topping out at the user code '3+4' in a.tex.

Is there a way of doing something like eval but chopping off the top of the call stack?

Motivation: debugging

Imagine an exception is raised by the Python fragment deep inside numpy, and pdb is launched. The user types up until they reach the scope of their user code and then types list. Because the fragments were compiled with the filename a.tex, this displays the a.tex file, which is the right context to show the user and is the reason I've done it this way. However, if the user types up again, they end up in the bowels of the PypTeX compiler.

An analogy would be if the g++ compiler had an error deep in a template, displayed a template "call stack" in its error message, but that template call stack backed all the way out into the bowels of the actual g++ call stack and exposed internal g++ details that would only serve to confuse the user.

Embedding Python in Python

Maybe the problem is that the illusion of the "embedded interpreter" created by eval is slightly too thin. eval lets you specify globals, but it inherits whatever call stack the caller has, so if one could somehow supply eval with a truncated call stack, that would resolve my problem. Alternatively, if pdb could be told "you shall go no further up" past a certain stack frame, that would help too; for example, it would suffice to chop off part of the stack in the traceback object and then pass it to pdb.post_mortem().
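
As a rough sketch of that last idea (essentially what the solution below builds on), the wrapper's own frame could be trimmed off the traceback before handing it to pdb; the '1/0' fragment here is just a stand-in for a failing user fragment:

    import pdb

    try:
        eval(compile('1/0', 'a.tex', mode='eval'), {})  # stand-in for a user fragment
    except Exception as e:
        tb = e.__traceback__.tb_next  # discard the frame of this wrapper code
        pdb.post_mortem(tb)           # pdb now tops out at the user's fragment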

Or if one could do from sys import Interpreter; foo = Interpreter(); foo.eval(...), meaning that foo is a clean embedded interpreter with a distinct call stack, global variables, and so on, that would also be good.

Is there a way of doing this?

A rejected alternative

One way that is not good is to extract all the Python fragments from a.tex with a regular expression, dump them into a temporary file a.py, and then run that file by invoking a fresh Python interpreter at the command line. This causes pdb to eventually top out in a.py. I've tried this and it's a very bad user experience. a.py should be an implementation detail; it is automatically generated and will look very unfamiliar to the user, and it is hard to figure out which bits of a.py came from which bits of a.tex. For large documents, I found this much too hard to use. See also pythontex.


Solution

  • I think I found a sufficient solution:

    import pdb, traceback
    
    def exec_and_catch(cmd, globals):
        try:
            exec(cmd, globals)            # execute a user program
        except Exception as e:
            tb = e.__traceback__.tb_next  # drop the exec_and_catch frame so the
            return e.with_traceback(tb)   # exception's traceback starts at the user's code
        return None                       # otherwise, return None signifying success
    
    foo = exec_and_catch("import module_that_does_not_exist", {})
    if foo is not None:
        # positional arguments keep this call working on Python 3.10+,
        # where print_exception no longer accepts etype as a keyword
        traceback.print_exception(type(foo), foo, foo.__traceback__)
        pdb.post_mortem(foo.__traceback__)
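
This works because the exception is caught immediately inside exec_and_catch, so the only compiler frame recorded in the traceback is exec_and_catch's own; dropping it via tb_next leaves the user's code as the topmost frame, and pdb's up command can go no further. If the fragments are additionally compiled under the a.tex filename as described above, list then shows the a.tex source rather than compiler internals.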