I want to build a centralised history that looks like
timestamp : topology_name : component_name : topology_id : component_id : VM hostname : VM IP : Worker port
What would be the best way to go about it in Storm? I can think of:
Reporting this from prepare() method of a spout/bolt
This requires you to enforce a certain base type for every spout and bolt, and you need to account for subclasses which don't call super.prepare(), e.g. by making prepare() final and having it call a protected abstract prepare0(), so that subclass logic has to go there.
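A minimal sketch of that final prepare / abstract prepare0 idea, written in plain Java so it runs without Storm on the classpath. ComponentContext, HistoryReporter and ReportingComponent are hypothetical names, not Storm classes. In a real topology the values would presumably come from TopologyContext (getStormId(), getThisComponentId(), getThisWorkerPort()) and the "topology.name" entry of the config map, and prepare would take Storm's usual arguments.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the parts of Storm's TopologyContext and config used here. */
interface ComponentContext {
    String topologyName();
    String componentName();
    String topologyId();
    String componentId();
    int workerPort();
}

/** Hypothetical sink for history records, e.g. a database or queue writer. */
interface HistoryReporter {
    void report(String record);
}

/** Base class that every spout/bolt would have to extend (assumption of this sketch). */
abstract class ReportingComponent {
    private final HistoryReporter reporter;

    protected ReportingComponent(HistoryReporter reporter) {
        this.reporter = reporter;
    }

    /** Final, so a subclass cannot override it and skip the report. */
    public final void prepare(ComponentContext ctx) {
        reporter.report(formatRecord(ctx));
        prepare0(ctx); // subclass-specific setup goes here
    }

    /** Subclasses put their own prepare logic here instead of overriding prepare. */
    protected abstract void prepare0(ComponentContext ctx);

    /** Builds "timestamp : topology_name : ... : worker_port" for the local machine. */
    private static String formatRecord(ComponentContext ctx) {
        String host = "unknown";
        String ip = "unknown";
        try {
            InetAddress local = InetAddress.getLocalHost();
            host = local.getHostName();
            ip = local.getHostAddress();
        } catch (UnknownHostException e) {
            // Keep the placeholders if the local address cannot be resolved.
        }
        return String.join(" : ",
                Instant.now().toString(),
                ctx.topologyName(),
                ctx.componentName(),
                ctx.topologyId(),
                ctx.componentId(),
                host,
                ip,
                String.valueOf(ctx.workerPort()));
    }
}
```

A concrete component then only implements prepare0 and gets the reporting for free. Note that Storm serializes spouts and bolts when the topology is submitted, so in practice the reporter would need to be serializable or be created inside prepare rather than passed to the constructor.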
Write a custom scheduler that reports the assignments
That's what I'd do, since it's transparent to the spouts and bolts you register and can be reused without any restrictions or incompatibilities. It's probably more complex, though, and requires more insight into Storm internals.