In some cases, when I load an existing pickle file, and after that dump it again, the size is almost halved.
I wonder why, and the first suspect is the protocol version. Can I somehow find out with which protocol version a file was pickled?
There may be a more elegant way, but to get down to the metal you can use pickletools:
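import pickle
import pickletools
s = pickle.dumps('Test')
# The first opcode is PROTO for protocol 2 and later;
# protocols 0 and 1 do not write one, so the assert would fail there.
proto_op = next(pickletools.genops(s))
assert proto_op[0].name == 'PROTO'
proto_ver = proto_op[1]  # the argument of PROTO is the declared version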
To figure out the version required to decode this, you'll need the maximum protocol version across all opcodes:
proto_ver = max(op[0].proto for op in pickletools.genops(s))
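Putting the two together, here is a minimal sketch that reports both numbers and also copes with pickles that have no PROTO opcode at all. The function name pickle_protocol_info is made up for this example; it is not part of pickle or pickletools.

```python
import pickle
import pickletools


def pickle_protocol_info(data):
    """Return (declared, required) protocol versions for a pickle.

    `data` may be a bytes object or a binary file object, since
    pickletools.genops accepts either.

    declared: argument of a leading PROTO opcode, or None if there is
              none (protocols 0 and 1 do not write one).
    required: highest protocol version in which any opcode used by
              the stream was introduced.
    """
    declared = None
    required = 0
    for index, (opcode, arg, _pos) in enumerate(pickletools.genops(data)):
        if index == 0 and opcode.name == 'PROTO':
            declared = arg
        required = max(required, opcode.proto)
    return declared, required


# Protocol 0 writes no PROTO opcode, protocol 2 does.
old_style = pickle_protocol_info(pickle.dumps('Test', protocol=0))
new_style = pickle_protocol_info(pickle.dumps('Test', protocol=2))
```

For a file on disk, open it in binary mode and pass the file object, e.g. `with open(path, 'rb') as f: pickle_protocol_info(f)`, where path is whatever your pickle file is called. If the numbers differ before and after re-dumping, the protocol version is a plausible reason for the size change.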