apache-storm

Proper way to ack/anchor Storm tuples


I have a bolt that processes tuples in small batches. It is essentially a switch statement that handles either a tuple from a previous bolt's stream or a tick tuple. It looks something like this:

switch(component) {
    case bolt1:
      do some work...
      anchors.add(tuple)    // hold the tuple as a pending anchor
      break
    case tick:
      do some work...
      collector.emit(anchors, value)
      collector.ack(tuple)  // acks only the tick tuple
      anchors.clear()
      break
}

When I run this, the Storm UI shows a very small number of tuples acked by this bolt. Is this the correct way to anchor them, or do I also need to call collector.ack(tuple) in the bolt1 case? Even though the Storm UI counts look odd, the topology runs without any tuples timing out.


Solution

  • Do not ack the tuples in the bolt1 case if you want the pending anchors to be replayed when the worker crashes. Also, never ack the same tuple more than once.

    It looks like you're acking only the tick tuple in the tick case, not the anchors. You should ack the anchors as well; otherwise the spout will be told they failed once the topology message timeout is reached.
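The ack order described in this answer can be sketched without a running cluster. This is a minimal illustration, not the asker's actual bolt: `BatchAckSketch`, `BatchingLogic`, `RecordingCollector`, `onData` and `onTick` are hypothetical names, and the `Collector` interface is a stand-in assumed to mirror only the `emit(anchors, values)` and `ack(tuple)` calls that Storm's `OutputCollector` provides.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class BatchAckSketch {

    /** Hypothetical stand-in for the two OutputCollector calls the bolt uses. */
    public interface Collector<T> {
        void emit(Collection<T> anchors, List<Object> values);
        void ack(T tuple);
    }

    /** Holds data tuples un-acked until a tick flushes the batch. */
    public static class BatchingLogic<T> {
        private final List<T> anchors = new ArrayList<>();
        private final Collector<T> collector;

        public BatchingLogic(Collector<T> collector) {
            this.collector = collector;
        }

        /** The bolt1 case: remember the tuple, but do not ack it yet. */
        public void onData(T tuple) {
            anchors.add(tuple);
        }

        /** The tick case: emit anchored to the batch, then ack each tuple once. */
        public void onTick(T tick, Object value) {
            collector.emit(new ArrayList<>(anchors),
                           Collections.<Object>singletonList(value));
            for (T pending : anchors) {
                collector.ack(pending);   // ack every batched tuple
            }
            collector.ack(tick);          // ack the tick tuple itself
            anchors.clear();              // nothing can be acked a second time
        }
    }

    /** Records calls so the ack order can be inspected without a cluster. */
    public static class RecordingCollector<T> implements Collector<T> {
        public final List<T> acked = new ArrayList<>();
        public int emits = 0;

        @Override
        public void emit(Collection<T> anchors, List<Object> values) {
            emits++;
        }

        @Override
        public void ack(T tuple) {
            acked.add(tuple);
        }
    }

    public static void main(String[] args) {
        RecordingCollector<String> collector = new RecordingCollector<>();
        BatchingLogic<String> bolt = new BatchingLogic<>(collector);
        bolt.onData("t1");
        bolt.onData("t2");
        System.out.println("acked before tick: " + collector.acked);
        bolt.onTick("tick", 42);
        System.out.println("acked after tick: " + collector.acked);
    }
}
```

In a real bolt the same two methods would sit behind the switch in `execute`, with the collector passed to `prepare` taking the place of the stand-in. Clearing the anchor list only after the acks means each batched tuple is acked exactly once, and any tuple still pending when a worker dies is never acked, so it can be replayed.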