java, serialization, effective-java

Effective Java Item 89: For instance control, prefer enum types to readResolve - Why can the stealer change the field of the original instance?


A serializable class is defined as follows:

import java.io.Serializable;
import java.util.Arrays;

public class Elvis implements Serializable {
    public static final Elvis INSTANCE = new Elvis();
    private Elvis() { }
    private String[] favoriteSongs = { "Hound Dog", "Heartbreak Hotel" };
    public void printFavorites() {
        System.out.println(Arrays.toString(favoriteSongs));
    }
    // Replaces every deserialized Elvis with the singleton instance
    private Object readResolve() {
        return INSTANCE;
    }
}

A stealer class is defined as follows:

import java.io.Serializable;

public class ElvisStealer implements Serializable {
    static Elvis impersonator;
    private Elvis payload;

    private Object readResolve() {
        // Save a reference to the "unresolved" Elvis instance
        impersonator = payload;
        // Return object of correct type for favoriteSongs field
        return new String[] { "A Fool Such as I" };
    }

    private static final long serialVersionUID = 0;
}

Finally, here is the program.

public class ElvisImpersonator {
    // Byte stream couldn't have come from a real Elvis instance!
    private static final byte[] serializedForm = {
        (byte)0xac, (byte)0xed, 0x00, 0x05, 0x73, 0x72, 0x00, 0x05,
        0x45, 0x6c, 0x76, 0x69, 0x73, (byte)0x84, (byte)0xe6,
        (byte)0x93, 0x33, (byte)0xc3, (byte)0xf4, (byte)0x8b,
        0x32, 0x02, 0x00, 0x01, 0x4c, 0x00, 0x0d, 0x66, 0x61, 0x76,
        0x6f, 0x72, 0x69, 0x74, 0x65, 0x53, 0x6f, 0x6e, 0x67, 0x73,
        0x74, 0x00, 0x12, 0x4c, 0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c,
        0x61, 0x6e, 0x67, 0x2f, 0x4f, 0x62, 0x6a, 0x65, 0x63, 0x74,
        0x3b, 0x78, 0x70, 0x73, 0x72, 0x00, 0x0c, 0x45, 0x6c, 0x76,
        0x69, 0x73, 0x53, 0x74, 0x65, 0x61, 0x6c, 0x65, 0x72, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x01,
        0x4c, 0x00, 0x07, 0x70, 0x61, 0x79, 0x6c, 0x6f, 0x61, 0x64,
        0x74, 0x00, 0x07, 0x4c, 0x45, 0x6c, 0x76, 0x69, 0x73, 0x3b,
        0x78, 0x70, 0x71, 0x00, 0x7e, 0x00, 0x02
    };

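    public static void main(String[] args) {
        // Initializes ElvisStealer.impersonator and returns
        // the real Elvis (which is Elvis.INSTANCE)
        Elvis elvis = (Elvis) deserialize(serializedForm);
        Elvis impersonator = ElvisStealer.impersonator;
        elvis.printFavorites();
        impersonator.printFavorites();
    }

    // Helper not shown in the question; a minimal version that reads
    // a single object back from a byte array
    private static Object deserialize(byte[] bytes) {
        try (java.io.ObjectInputStream in = new java.io.ObjectInputStream(
                new java.io.ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (java.io.IOException | ClassNotFoundException e) {
            throw new IllegalArgumentException(e);
        }
    }
}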

Here is the output of the program.

[Hound Dog, Heartbreak Hotel]
[A Fool Such as I]

Here is the structured serialized form (image not included).

While reading this part, I ran into three questions:

  1. Why can the return value of ElvisStealer's readResolve() method change the favoriteSongs field of impersonator?
  2. Why is 0x71, 0x00, 0x7e, 0x00, 0x02 a reference to Elvis?
  3. Why is 0x71, 0x00, 0x7e, 0x00, 0x02 assigned to the payload field of ElvisStealer?

Solution

  • I think it helps if you understand how Java Serialization works.

    When an object is serialized, the stored data consists of:

    • a class description (the class name, the serialVersionUID, some flags and information about the fields)
    • the values of the fields (only non-static, non-transient fields)

    For an object that has already been written to the stream, serialization stores only a short code meaning "use that previous instance" (in Java Serialization terms: a back reference).
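A back reference can be observed directly. The following is a minimal sketch, not part of the original answer: the class name BackReferenceSketch and its helper methods are made up for illustration. It writes an array that contains the same String twice, checks that the stream ends with a five-byte back reference, and confirms that both array slots point to one object after reading it back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;

// Hypothetical demo class, not from the question or the book.
public class BackReferenceSketch {

    // Serializes an array whose two slots hold the very same object.
    static byte[] writeTwice(Object shared) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Object[] { shared, shared });
        }
        return bytes.toByteArray();
    }

    // Reads the array back from the stream.
    static Object[] readBack(byte[] stream) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            return (Object[]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] stream = writeTwice("Hound Dog");

        // The second slot is the last thing written: one marker byte plus a 4-byte handle.
        boolean endsWithReference =
                stream[stream.length - 5] == ObjectStreamConstants.TC_REFERENCE;
        System.out.println("ends with back reference: " + endsWithReference);

        // Because of the back reference, both slots resolve to one object.
        Object[] restored = readBack(stream);
        System.out.println("same instance: " + (restored[0] == restored[1]));
    }
}
```

The identity check in the last line is the important part: a back reference restores object identity, not just equal contents.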

    If the object is read back, the process is:

    • read the class description
    • create an instance of the class
    • read the values for the fields
    • if a readResolve() method exists, call it and use its return value as the result
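The effect of the last step can be sketched with a round trip. This is only an illustration: Registry, Plain and roundTrip are hypothetical names, not taken from the question. A class with readResolve() comes back as the existing singleton, while a class without it comes back as a fresh copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not from the question or the book.
public class ReadResolveSketch {

    // Singleton that substitutes itself for every deserialized copy.
    static final class Registry implements Serializable {
        static final Registry INSTANCE = new Registry();
        private Registry() { }
        private Object readResolve() {
            return INSTANCE;
        }
    }

    // Ordinary serializable class without readResolve().
    static final class Plain implements Serializable {
    }

    // Serializes the object to memory and reads it straight back.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Plain plain = new Plain();
        // readResolve() swaps the new instance for the singleton.
        System.out.println(roundTrip(Registry.INSTANCE) == Registry.INSTANCE); // true
        // Without readResolve() the caller gets the newly created instance.
        System.out.println(roundTrip(plain) == plain); // false
    }
}
```

Note that a new Registry instance is still created during deserialization; readResolve() only decides what the caller receives. That short-lived instance is what the stealer captures.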

    With the payload from serializedForm, the complete process is:

    1. read the class description of Elvis
    2. create an Elvis instance
    3. read the value for the field favoriteSongs:
      • read the class description of ElvisStealer
      • create an ElvisStealer instance
      • read the value for the field payload
        • the value is a reference to the Elvis instance created in step 2
        • this is the ominous 0x71 0x00 0x7e 0x00 0x02 at the end of the serialized data
      • execute the readResolve() method of ElvisStealer, which saves the payload into ElvisStealer.impersonator and then returns new String[] { "A Fool Such as I" }
    4. store this new String[] { "A Fool Such as I" } as favoriteSongs of the Elvis instance created in step 2
    5. execute the readResolve() method of Elvis, which returns Elvis.INSTANCE
    6. return that Elvis.INSTANCE as the final result of the deserialize() call

    So the stealer never touches Elvis.INSTANCE: it changes the second Elvis instance created in step 2, which Elvis.readResolve() would normally have discarded.
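The sequence above can be reproduced without hand-written bytes. The sketch below rests on an assumption that is not in the question: instead of the book's byte array, it forges the stream with an ObjectOutputStream subclass whose replaceObject() swaps the String[] for a stealer while writing. Elvis and ElvisStealer are re-declared as nested classes so the file is self-contained; forge() and favorites() are made-up helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical demo class; mirrors the question's classes as nested types.
public class ReadResolveAttackSketch {

    static final class Elvis implements Serializable {
        static final Elvis INSTANCE = new Elvis();
        private String[] favoriteSongs = { "Hound Dog", "Heartbreak Hotel" };
        private Elvis() { }
        String favorites() { return Arrays.toString(favoriteSongs); }
        private Object readResolve() { return INSTANCE; }
    }

    static final class ElvisStealer implements Serializable {
        static Elvis impersonator;
        private Elvis payload;
        ElvisStealer(Elvis payload) { this.payload = payload; }
        private Object readResolve() {
            // payload is the unresolved Elvis that deserialization just created
            impersonator = payload;
            return new String[] { "A Fool Such as I" };
        }
    }

    // Assumption: forge the stream by replacing the String[] while writing.
    // The stealer's payload is the Elvis already in the stream, so it is
    // written as a back reference.
    static byte[] forge() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes) {
            { enableReplaceObject(true); }
            @Override
            protected Object replaceObject(Object obj) {
                return obj instanceof String[] ? new ElvisStealer(Elvis.INSTANCE) : obj;
            }
        }) {
            out.writeObject(Elvis.INSTANCE);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object result = deserialize(forge());
        System.out.println(result == Elvis.INSTANCE);              // the real Elvis comes back
        System.out.println(Elvis.INSTANCE.favorites());            // untouched original
        System.out.println(ElvisStealer.impersonator.favorites()); // the second instance
    }
}
```

The stream produced this way is not byte-identical to the one in the question (the class names and serialVersionUID values differ), but it has the same shape: an Elvis whose favoriteSongs value is an ElvisStealer whose payload is a back reference to that Elvis.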

    Why is this 0x71 0x00 0x7e 0x00 0x02 the reference to the Elvis instance created in step 2?

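    The breakdown of 0x71 0x00 0x7e 0x00 0x02 is:

    • 0x71 is TC_REFERENCE, the marker for a back reference to something already in the stream
    • 0x00 0x7e 0x00 0x02 is the four-byte handle 0x007e0002, that is baseWireHandle (0x7e0000) plus the index 2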
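Decoding those five bytes can be sketched with the constants from java.io.ObjectStreamConstants. The class name HandleDecodeSketch and the method handleIndex are hypothetical; only TC_REFERENCE and baseWireHandle come from the JDK.

```java
import java.io.ObjectStreamConstants;
import java.nio.ByteBuffer;

// Hypothetical demo class, not from the question or the book.
public class HandleDecodeSketch {

    // Returns the handle table index encoded in a 5-byte back reference.
    static int handleIndex(byte[] reference) {
        if (reference.length != 5 || reference[0] != ObjectStreamConstants.TC_REFERENCE) {
            throw new IllegalArgumentException("not a TC_REFERENCE entry");
        }
        // The handle is a big-endian int following the marker byte.
        int wireHandle = ByteBuffer.wrap(reference, 1, 4).getInt();
        // Handles on the wire start at baseWireHandle (0x7e0000).
        return wireHandle - ObjectStreamConstants.baseWireHandle;
    }

    public static void main(String[] args) {
        byte[] tail = { 0x71, 0x00, 0x7e, 0x00, 0x02 };
        System.out.println(handleIndex(tail)); // prints 2
    }
}
```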

    The Java serialization mechanism records "interesting entries" from the data stream in a handle table (HandleTable handles) so that later on, if the same "interesting entry" appears again, it can store just a short back reference to that entry instead of writing the whole thing again.

    In your example the "interesting entries" are:

    • index 0: the class descriptor for the Elvis class (class name, serialVersionUID, flags, information about the fields)
    • index 1: the string Ljava/lang/Object; - the type signature of java.lang.Object, which is the declared type of the favoriteSongs field in the serialized data
    • index 2: the Elvis instance
    • index 3: the class descriptor for the ElvisStealer class
    • index 4: the string LElvis; - the type signature of the class Elvis, which is the declared type of the payload field in the serialized data
    • index 5: the ElvisStealer instance

    This table is not stored anywhere in the serialized data; it is rebuilt entry by entry during the deserialization process.

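    There is no simple rule such as "the first object is always number 2". The index depends on how many serializable superclasses the class of that first object has and on how many non-primitive fields the class and its serializable superclasses declare, because each class descriptor and each field type string takes up an index before the object itself does.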
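How the index of the first object shifts with the class shape can be checked with a small experiment. This is a sketch under assumptions: NoFields, OneObjectField, Child and handleOfFirstObject are hypothetical names. It writes the same object twice and reads the handle from the back reference that ends the stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.nio.ByteBuffer;

// Hypothetical demo class, not from the question or the book.
public class FirstObjectHandleSketch {

    static class NoFields implements Serializable { }

    static class OneObjectField implements Serializable {
        Object ref; // left null, so the value itself takes no handle
    }

    static class Child extends OneObjectField {
        String name; // also left null
    }

    // Writes obj twice; the second write is a back reference whose
    // last four bytes hold the handle that was assigned to obj.
    static int handleOfFirstObject(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.writeObject(obj);
        }
        byte[] stream = bytes.toByteArray();
        int wireHandle = ByteBuffer.wrap(stream, stream.length - 4, 4).getInt();
        return wireHandle - ObjectStreamConstants.baseWireHandle;
    }

    public static void main(String[] args) throws IOException {
        // descriptor, then the object
        System.out.println(handleOfFirstObject(new NoFields()));
        // descriptor, field type string, then the object
        System.out.println(handleOfFirstObject(new OneObjectField()));
        // two descriptors and two field type strings, then the object
        System.out.println(handleOfFirstObject(new Child()));
    }
}
```

OneObjectField has the same shape as Elvis in the forged stream (one descriptor, one object-typed field), which is why the Elvis instance lands on index 2 there.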