I have a third-party class that I am trying to use in Hadoop, and thus need to have it implement Writable. The problem is that the way Hadoop uses Writable is to create an object o = new SomeObject(), then call o.readFields(in) to deserialize it, and in my situation I cannot create the empty object:
public abstract class Cube {
    protected final int size;
    protected Cube(int size) { this.size = size; }
}
Note that size is final.
public class RealCube extends Cube {
    public RealCube(int size) { super(size); }
}
Here RealCube only has one super constructor to call, and that constructor sets the final variable in the abstract super class.
public class RealCubeWritable implements Writable {
    public void write(DataOutput out) throws IOException {
        /* write out the size */
    }
    public void readFields(DataInput in) throws IOException {
        /* yikes! need to set the size */
    }
}
When we get down to trying to implement RealCubeWritable, I cannot have a no-argument RealCubeWritable() constructor, and I cannot know the actual size until the DataInput stream is examined.
So it seems like the only way to do this in Hadoop is to use a wrapper. What I am wondering is whether there is a way to use a wrapper but still have RealCubeWritable behave like RealCube. I've looked into using dynamic proxy classes, but I'm not sure whether this will work (or how to actually do it).
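Something like the following sketch is what I mean by a wrapper (for illustration it assumes RealCube exposes a getSize() accessor, which the real class may not have):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class RealCubeWritable implements Writable {
    private RealCube cube;                      // wrapped instance; null until set

    public RealCubeWritable() { }               // the no-arg constructor Hadoop needs
    public RealCubeWritable(RealCube cube) { this.cube = cube; }

    public RealCube get() { return cube; }      // callers unwrap to reach the RealCube

    public void write(DataOutput out) throws IOException {
        out.writeInt(cube.getSize());           // getSize() is assumed, not part of Cube above
    }

    public void readFields(DataInput in) throws IOException {
        cube = new RealCube(in.readInt());      // size is finally known here
    }
}

The annoyance is that downstream code has to call get() to unwrap, which is why I was wondering whether a dynamic proxy could hide that.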
Thanks!
If you genuinely have no control over the Cube class, then I'm not sure you have many (pleasant) options:
Is size relatively small (i.e. can it only be a limited set / range of values)? If so, you could create an instance of RealCube for each valid size value and, again using a custom Serialization implementation, pick the right Cube instance based upon the size read from the input stream.
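A rough sketch of that idea (CubeCache and the MAX_SIZE bound are hypothetical names/values; it assumes size is small enough that pre-building every instance up front is cheap):

import java.io.DataInput;
import java.io.IOException;

public class CubeCache {
    private static final int MAX_SIZE = 16;                 // hypothetical upper bound on size
    private static final RealCube[] CUBES = new RealCube[MAX_SIZE + 1];

    static {
        for (int s = 1; s <= MAX_SIZE; s++) {
            CUBES[s] = new RealCube(s);                      // one shared instance per valid size
        }
    }

    /** Picks the pre-built cube matching the size read from the stream. */
    public static RealCube forSize(DataInput in) throws IOException {
        int size = in.readInt();
        if (size < 1 || size > MAX_SIZE) {
            throw new IOException("unexpected cube size: " + size);
        }
        return CUBES[size];
    }
}

This trades a little up-front memory for never needing to construct an empty RealCube at deserialization time (note the instances are shared, so it only works if RealCube is immutable).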