apache-spark distributed-computing apache-zookeeper

Distributed sum of numbers


I have a set of web apps running on different machines, and each app tracks a set of metrics. I want a running sum of each metric across all of the machines. These cumulative metrics will be used later, so they should be persisted, say, once per day. How should I approach this? It would be easy to do with Accumulator variables in Spark, but Spark can't be installed on these machines.

I think this should also be possible with ZooKeeper, but how?


Solution

  • I went ahead with the Distributed Atomic Long recipe in Netflix Curator:

    http://netflix.github.io/curator/doc/com/netflix/curator/framework/recipes/atomic/DistributedAtomicLong.html
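    For anyone wondering how a counter like this can work on top of ZooKeeper: as far as I understand it, the recipe first tries an optimistic update (read the value and its version, then write back only if the version is unchanged) and retries on conflict. The sketch below imitates that idea with plain JDK classes so it runs without a ZooKeeper ensemble. Everything in it is hypothetical: `VersionedStore` is an in-memory stand-in for ZooKeeper, `MetricCounterSketch` is not a Curator class, and the `/metrics/<day>/<metric>` path layout and the `requests` metric name are just assumptions for illustration. With Curator itself you would instead construct a `DistributedAtomicLong` for each counter path and call `add(delta)`, checking `succeeded()` on the returned `AtomicValue<Long>`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: NOT Curator's implementation, only the optimistic
// versioned-write idea, run against an in-memory stand-in for ZooKeeper.
class MetricCounterSketch {

    // Stand-in for ZooKeeper: each path holds a value and a version number.
    static final class VersionedStore {
        static final class Node {
            final long value;
            final int version;
            Node(long value, int version) { this.value = value; this.version = version; }
        }

        private final ConcurrentHashMap<String, AtomicReference<Node>> nodes =
                new ConcurrentHashMap<>();

        private AtomicReference<Node> ref(String path) {
            // A missing path is treated as a counter starting at 0, version 0.
            return nodes.computeIfAbsent(path, p -> new AtomicReference<>(new Node(0L, 0)));
        }

        Node read(String path) { return ref(path).get(); }

        // Conditional write: succeeds only if nobody has written since
        // expectedVersion was read (the same idea as a versioned setData).
        boolean writeIfVersion(String path, long newValue, int expectedVersion) {
            AtomicReference<Node> r = ref(path);
            Node current = r.get();
            if (current.version != expectedVersion) return false;
            return r.compareAndSet(current, new Node(newValue, expectedVersion + 1));
        }
    }

    private final VersionedStore store;

    MetricCounterSketch(VersionedStore store) { this.store = store; }

    // Assumed layout: one counter per metric per day, so each day's total
    // stays available under its own path.
    static String pathFor(String metric, String day) {
        return "/metrics/" + day + "/" + metric;
    }

    // Optimistic add: read, attempt a versioned write, retry on conflict.
    long add(String metric, String day, long delta) {
        String path = pathFor(metric, day);
        while (true) {
            VersionedStore.Node seen = store.read(path);
            if (store.writeIfVersion(path, seen.value + delta, seen.version)) {
                return seen.value + delta;
            }
        }
    }

    long total(String metric, String day) {
        return store.read(pathFor(metric, day)).value;
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedStore store = new VersionedStore();
        // Three threads stand in for three web apps on different machines.
        Thread[] apps = new Thread[3];
        for (int i = 0; i < apps.length; i++) {
            apps[i] = new Thread(() -> {
                MetricCounterSketch counter = new MetricCounterSketch(store);
                for (int n = 0; n < 1000; n++) {
                    counter.add("requests", "2014-01-01", 1);
                }
            });
            apps[i].start();
        }
        for (Thread t : apps) t.join();
        System.out.println(new MetricCounterSketch(store).total("requests", "2014-01-01"));
    }
}
```

    Each app only ever sends its own delta, and the conditional write is what keeps concurrent updates from overwriting each other. Keying the path by day is one way to get the "persisted for each day" part: the previous day's node simply stops receiving updates and can be read later.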