I am working on a project whose objective is to separate the training and testing stages of machine learning projects. I designed the code to wrap the model being used (a classifier, for instance) in a class called Model:
class Model:
    def __init__(self, newModel):
        self.model = newModel   # the wrapped estimator, e.g. a classifier
        self.functions = {}     # registry of callables exposed by the model
Then I pass in the function objects that the model has to provide, using a list:
    def addFunctions(self, functions):
        for function in functions:
            self.functions[function.__name__] = function
Now the model can be used for classification, for instance, by constructing it with a classifier object and passing its functions in a list to addFunctions, so that I can invoke them by name. Then I package the model and the code in a Docker container, which is, simply put, a lightweight virtual machine.
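A minimal sketch of the intended usage (the scikit-learn classifier and the toy data are purely illustrative, not my actual setup; the Model class is the one defined above):

import numpy as np
from sklearn.linear_model import LogisticRegression

# toy data, purely for illustration
X_train = np.random.rand(20, 3)
y_train = np.array([0, 1] * 10)

clf = LogisticRegression()
model = Model(clf)

# register the bound methods that should be callable by name later on
model.addFunctions([clf.fit, clf.predict])

# invoke the registered functions by name
model.functions["fit"](X_train, y_train)
predictions = model.functions["predict"](X_train)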
The purpose of the separation is to pass only the trained model to the Docker container after optimizing it, without having to ship the whole codebase. Hence the need to save/serialize the Python Model object.
I tried using pickle as well as jsonpickle, but both of them had limitations when serializing certain types of objects. I could not find any alternative that is generic enough for object storage and retrieval. Are there any alternatives?
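To illustrate the kind of limitation I mean (a minimal example, not my actual model): standard pickle refuses objects that reference things like lambdas, since it serializes functions by qualified name only.

import pickle

# a function defined as a lambda -- pickle cannot serialize it
double = lambda x: 2 * x

try:
    pickle.dumps(double)
except (pickle.PicklingError, AttributeError) as e:
    print("pickle failed:", e)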
Both dill and cloudpickle are very robust serializers, and can serialize almost any object in standard Python. (I'm the dill author, btw.)

dill is available as a standalone package at:
https://github.com/uqfoundation/dill/

while cloudpickle has pretty much died (it was supported by picloud, but they went commercial… and that has left pyspark and a few other packages supporting it inside their own codebases):
https://github.com/apache/spark/blob/master/python/pyspark/cloudpickle.py

I use dill as the backbone of parallel and distributed computing in statistical computing and optimization, and have used it to enable parallel machine learning techniques. I haven't tried docker objects, however.
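A minimal sketch of round-tripping a wrapped model with dill (it assumes the Model instance and toy data from the question; the file name is arbitrary):

import dill

# serialize the wrapped model, including its registered bound methods
with open("model.pkl", "wb") as f:
    dill.dump(model, f)

# later, e.g. inside the docker container, restore it and use it
with open("model.pkl", "rb") as f:
    restored = dill.load(f)

predictions = restored.functions["predict"](X_train)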