
TupleWindow Start/End Time in Apache Storm


I have been developing a profiling application that works on CDR (Call Detail Record) data in Apache Storm. The application's main purpose is to extract each caller's TotalCallCount and TotalCallDuration during a specified time block (that is, in every window). For profiling, I want to use the sliding window technique.

To illustrate the idea, see the following image: [SlidingWindow image]

For profiling, I need to know when each TupleWindow started and ended, that is, the start and end timestamps of the TupleWindow (the sliding window).

I looked through the Storm implementation but couldn't find anything about this. Could you help me figure out how to get these timestamps?

Thank you very much


Solution

  • If you are using a 1.x release of Apache Storm, this information is not directly accessible via the TupleWindow, so you will have to calculate it manually. For example:

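    import org.apache.storm.topology.base.BaseWindowedBolt;
    import org.apache.storm.windowing.TupleWindow;

    public class MyBolt extends BaseWindowedBolt {
      ...
      private long windowLengthMs;

      @Override
      public BaseWindowedBolt withWindow(Duration windowLength, Duration slidingInterval) {
          // Duration.value holds the duration in milliseconds.
          this.windowLengthMs = windowLength.value;
          return super.withWindow(windowLength, slidingInterval);
      }

      @Override
      public void execute(TupleWindow inputWindow) {
        // Processing-time approximation: treat "now" as the moment the window fired.
        long windowEnd = System.currentTimeMillis();
        // A sliding window covers the last windowLength ms before it fires.
        long windowStart = windowEnd - windowLengthMs;
        ...
      }
    }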

    However, this is not straightforward in all cases, especially with event-time windows, where window boundaries follow tuple timestamps and watermarks rather than the wall clock.

    In the latest master branch of Storm, TupleWindow has a getTimestamp method that returns the window end timestamp and works for both processing-time and event-time windows. This will be available in a future release of Storm (2.0). It could also be backported to future Storm 1.x releases.
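Once a window-end timestamp is available (for instance from a method like getTimestamp), the start of the window can be derived from the configured window length. The following is only a minimal sketch in plain Java with no Storm dependency. The class WindowBounds and its methods are hypothetical names introduced for illustration, and the sketch assumes that the bolt knows its window length in milliseconds (for example, captured in an overridden withWindow) and that a window covers the half-open range (start, end].

```java
// Hypothetical helper, not part of Apache Storm.
final class WindowBounds {
    final long startMs; // exclusive lower bound (assumption of this sketch)
    final long endMs;   // inclusive upper bound (assumption of this sketch)

    private WindowBounds(long startMs, long endMs) {
        this.startMs = startMs;
        this.endMs = endMs;
    }

    // Derives the window range from its end timestamp and its length,
    // both expressed in milliseconds.
    static WindowBounds fromEnd(long windowEndMs, long windowLengthMs) {
        if (windowLengthMs <= 0) {
            throw new IllegalArgumentException("windowLengthMs must be positive");
        }
        return new WindowBounds(windowEndMs - windowLengthMs, windowEndMs);
    }

    // Returns true if the given timestamp falls inside (startMs, endMs].
    boolean contains(long timestampMs) {
        return timestampMs > startMs && timestampMs <= endMs;
    }
}
```

Inside execute, a bolt could then build the range with something like WindowBounds.fromEnd(inputWindow.getTimestamp(), windowLengthMs) on a Storm version that provides getTimestamp, or with System.currentTimeMillis() as the end on older 1.x releases using processing-time windows.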