I'm confused about when it's OK to use Vertex instance variables to maintain state rather than proper Giraph values à la getValue(). An interesting example I found in the source demonstrates both: SimpleTriangleClosingVertex, which has both an instance variable (closeMap) and a custom vertex value (IntArrayListWritable). I'm a little surprised that using an instance variable is legit, since it could possibly screw up serialization. My question: are both approaches valid? If so, how do I choose one over the other? Thanks very much.
The Compute classes in Giraph aren’t serialized. Giraph only serializes the vertex value object, which you receive in the compute method through the vertex parameter. You can create as many instance variables as you wish to make your function definitions easier, since helper methods can access the instance variables directly and don’t need every parameter passed to them. But always consider the following two things: