java object serialization deserialization rmi

Is there a difference between RMI and plain object serialization?


Something seems strange to me. I have two classes communicating via RMI. When I transfer an object graph over RMI from the producing system, some data is missing at the destination.

To analyse this further, I serialized the object graph to disk before returning it over RMI to the caller. On the caller's side, I wrote the returned object to disk as well. The two files differ in size.

When I turn both files back into object graphs with a small helper, I can see that some objects are missing on the caller's side after the transfer.

Honestly, I find this quite strange and have no clue what is going on, since the same thing works on my developer machine. I expected a SerializationException, but none was thrown.

I am lost ;-(. I expected that if serializing an object to disk works, it would work over RMI too.

Does anyone know a first step to debug this behaviour?


Solution

  • One of your comments contained an important clue that leads me to hypothesize about what might be going on. If the classes on the different systems are indeed different, then there are some ways that they can differ such that serialization and deserialization can result in loss of information, with no exception or errors being reported.

    For there to be no errors, both sides must have the same set of serializable classes (by name), and where those classes differ, they must declare identical serialVersionUID values. If serializable fields have been added or removed, data can be dropped without error. This happens easily when no custom serialized form is implemented (via readObject and writeObject methods), but it can also happen when these methods are implemented and proper care hasn't been taken as the class evolved.

    Here's an example. Suppose there is class A that is originally defined as follows:

    // original version
    import java.io.Serializable;

    class A implements Serializable {
        private static final long serialVersionUID = 9783425L;
        String x;
        String y;
    }
    

    Now let's say that in the next version, another field was added to this class:

    // modified version: same serialVersionUID, one extra field
    import java.io.Serializable;

    class A implements Serializable {
        private static final long serialVersionUID = 9783425L;
        String x;
        String y;
        String z;
    }
    

    Suppose now that one machine has the modified version of class A and it has serialized some instances of this class. It now transmits this serialized data (either via RMI or through some other means, it doesn't actually matter) to another machine that happens to have the original version of class A. When this machine deserializes instances of A, serialized data in the new field z will be dropped silently!

    Roughly the opposite happens in the other direction. If the machine with the original version of A transmits serialized data to the machine with the new version, the stream contains no data for the z field, so after deserialization z is left at its default value, which is null.

    This is described in Section 3.1 of the Serialization Specification, where it talks about the defaultReadObject method:

    Any field of the object that does not appear in the stream is set to its default value. Values that appear in the stream, but not in the object, are discarded. This occurs primarily when a later version of a class has written additional fields that do not occur in the earlier version.

    I don't think this is the only possible way that the serialized data could differ between the systems, but if the classes differ, then this might be what's going on.

    How to debug this? Since you've written the serialized bytes to disk and they differ, you can pick through the encoded serialized form and see what the differences are. The serialization data format isn't terribly complex but it is quite tedious.
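
    Rather than reading the bytes by hand, one option is to let ObjectInputStream report the class descriptors it finds in the stream. The following is only a sketch: StreamDescriptorDump, LoggingInput and Sample are hypothetical names made up for illustration, and it assumes the classes in your saved file are on the classpath. It overrides readClassDescriptor, which returns the descriptor as the sender wrote it, so the field list shown is the sender's view of each class, not the local one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class StreamDescriptorDump {

    /** Records every class descriptor exactly as it appears in the stream. */
    static class LoggingInput extends ObjectInputStream {
        private final List<String> seen;

        LoggingInput(InputStream in, List<String> seen) throws IOException {
            super(in);
            this.seen = seen;
        }

        @Override
        protected ObjectStreamClass readClassDescriptor()
                throws IOException, ClassNotFoundException {
            // This descriptor comes from the stream, i.e. from the writing side.
            ObjectStreamClass desc = super.readClassDescriptor();
            StringBuilder sb = new StringBuilder(desc.getName())
                    .append(" suid=").append(desc.getSerialVersionUID())
                    .append(" fields=");
            for (ObjectStreamField f : desc.getFields()) {
                sb.append(f.getName()).append(' ');
            }
            seen.add(sb.toString().trim());
            return desc;
        }
    }

    /** Hypothetical stand-in for one of your own serializable classes. */
    static class Sample implements Serializable {
        private static final long serialVersionUID = 9783425L;
        String x = "a";
        String y = "b";
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns one line per class descriptor found while reading the first object. */
    static List<String> describe(byte[] data) {
        List<String> seen = new ArrayList<>();
        try (ObjectInputStream in = new LoggingInput(new ByteArrayInputStream(data), seen)) {
            in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            // Keep whatever was collected before the failure.
            seen.add("stopped early: " + e);
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a saved serialization file, or run without arguments for a demo.
        byte[] data = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : serialize(new Sample());
        describe(data).forEach(System.out::println);
    }
}
```

    Running this against the file written on each side and comparing the output should show whether a class arrives with a field list that differs from what the receiving side declares.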

    Another approach is to examine the classes on the different systems using javap -private. You need -private because even private fields can be serialized. This may tell you if any fields were added or removed between the different versions on the different systems.
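
    A programmatic variant of the same check, offered as a sketch rather than a finished tool: ObjectStreamClass.lookup reports the serialVersionUID and the serializable fields that the local JVM sees for a class. SerialFormDump and its nested class A are hypothetical names used for illustration. The idea is to run it with the same class names on both systems and compare the output.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;

class SerialFormDump {

    /** Hypothetical example class; in practice pass your own class names to main. */
    static class A implements Serializable {
        private static final long serialVersionUID = 9783425L;
        private String x;
        private String y;
    }

    /** Describes the serialized form of a class as seen by the local JVM. */
    static String describe(Class<?> cls) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            // lookup returns null for classes that are not serializable
            return cls.getName() + " is not serializable\n";
        }
        StringBuilder sb = new StringBuilder();
        sb.append(cls.getName())
          .append(" serialVersionUID=").append(desc.getSerialVersionUID())
          .append('\n');
        // getFields() lists the serializable fields only (no static or transient ones)
        for (ObjectStreamField f : desc.getFields()) {
            sb.append("  ").append(f.getType().getName())
              .append(' ').append(f.getName()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        if (args.length == 0) {
            System.out.print(describe(A.class));
        }
        for (String name : args) {
            System.out.print(describe(Class.forName(name)));
        }
    }
}
```

    If the two systems print the same serialVersionUID but different field lists for a class, that class is a likely candidate for the silent data loss described above.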