Search code examples
pythonobjectpython-3.5bytecode

How is a Python code object marshalled at the C level?


A post by Ned Batchelder back in 12008 made the claim the Python code objects are marshalled in Python bytecode files:

[...] The entire rest of the file is just the output of marshal.dump() of the code object that results from compiling the source file.

This may apply to code objects created at the Python level, but how exactly are these code objects marshalled at the C level?

I've looked over the documentation for Python code objects(version 3.5), and not much is said about them and certainly doesn't provide an information about how their marshalled:

The C structure of the objects used to describe code objects. The fields of this type are subject to change at any time.

I've also tried to take a look at the Python code object's source code, but couldn't really find anything that answered my question.

If Python code objects are implemented using structs at the C level - which I believe they are - how then can they be marshalled into bytecode files? As far as the research I've done, C provides no built in methods for marshalling nor any third-party libraries?

If code objects are marshalled at the C level, how exactly is it done? If not, how are they encoded into the bytecode?

1I especial noted the date because perhaps this detail has changed between Python versions.


Solution

  • This may apply to code objects created at the Python level, but how exactly are these code objects marshalled at the C level?

    With marshal.dump, or perhaps one of the related interfaces like marshal.dumps.

    It's not as simple as "delegate to some C standard library function"; the marshalling and unmarshalling implementation comprises an 1820-line file you'd have to read if you want to see every detail. With how Python-specific its data structures and functionality are, most of the object tree traversal and serialization need to be written from scratch.