Search code examples
pythonbytecode-manipulationgenetic-programming

How to validate Python bytecode?


I'm thinking to do some bytecode manipulation (think genetic programming) in Python.

I came across a test case in crashers test section of Python source tree that states:

Broken bytecode objects can easily crash the interpreter. This is not going to be fixed.

Thus the question, how to validate given tweaked byte code that it will not crash interpreter? Is it even possible?

Test source, after http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html

cc = (lambda fc=(
    lambda n: [
        c for c in
            ().__class__.__bases__[0].__subclasses__()
            if c.__name__ == n
        ][0]
    ):
    fc("function")(
        fc("code")(
            0, 0, 0, 0, "KABOOM", (), (), (), "", "", 0, ""
        ), {}
    )()
)

Here, this module defines cc that, if called, mymod.cc() crashes interpreter. Granted this is a very tricky example that created new code object with custom bytecode "KABOOM" and then runs it.

I'd accept something that verifies predefined bytecode, e.g. from a .pyc file.


Solution

  • Both outdated, the first one without code (at least I can't find) but may be useful to give an idea of what/how can be done and what are the limitations.

    perfectly valid bytecode can still do horrible things