google-cloud-dataflow, apache-beam, apache-beam-internals

Difference between a pane and a window in Apache Beam


What's the difference between a pane and a window? Incoming elements are grouped into windows, so what does a pane contain?

I took the following code from the Beam docs:

.of(new DoFn<String, String>() {
     @ProcessElement
     public void processElement(@Element String word, PaneInfo paneInfo) {
       // paneInfo describes the pane (trigger firing) the current element belongs to
     }
  })

Does each element belong to one pane or to multiple panes? I need a simple analogy to understand panes and windows.


Solution

  • A windowing strategy partitions data by event time. One element can belong to multiple windows (for example, with sliding windows).

    A pane is emitted each time a trigger fires for a window, so a window can emit multiple panes, one per firing. If you do not set a trigger, the default trigger applies, and the window typically emits a single pane once the watermark passes the end of the window.

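    The accumulation mode then determines how successive panes of the same window relate to each other: each pane contains either everything the window has seen so far or only the elements that arrived since the previous pane.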

    You can think of a window as a class and a pane as an instance of that class. An element can belong to one or more windows, and each window uses its elements to emit panes.

    More details can be found in the sections of the programming guide about windows and triggers:

    When you specify a trigger, you must also set the window’s accumulation mode. When a trigger fires, it emits the current contents of the window as a pane. Since a trigger can fire multiple times, the accumulation mode determines whether the system accumulates the window panes as the trigger fires, or discards them.

    To set a window to accumulate the panes that are produced when the trigger fires, invoke .accumulatingFiredPanes() when you set the trigger. To set a window to discard fired panes, invoke .discardingFiredPanes().
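To make the window/pane distinction concrete, here is a minimal toy model in plain Java. It is only a sketch and does not use the Apache Beam API: the class `PaneSketch` and the methods `slidingWindowStarts` and `firePanes` are hypothetical names made up for illustration, and the "trigger" is assumed to fire once after every batch of arrivals. The first method shows how one element can fall into several sliding windows; the second shows what the panes of a single window contain under the two accumulation modes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: plain Java, not the Apache Beam API. All names are hypothetical.
class PaneSketch {

    // Start times of every sliding window [start, start + size) that contains
    // the given timestamp. Assumes a non-negative timestamp and positive size/period.
    static List<Long> slidingWindowStarts(long timestamp, long size, long period) {
        List<Long> starts = new ArrayList<>();
        long lastStart = timestamp - (timestamp % period); // latest window start <= timestamp
        for (long start = lastStart; start > timestamp - size; start -= period) {
            starts.add(start);
        }
        return starts;
    }

    // Simulates ONE window whose trigger fires after each batch of arrivals.
    // Returns the contents of every pane the window emits, in firing order.
    static List<List<String>> firePanes(List<List<String>> batches, boolean accumulating) {
        List<List<String>> panes = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        for (List<String> batch : batches) {
            buffer.addAll(batch);
            panes.add(new ArrayList<>(buffer)); // trigger fires: emit the buffer as a pane
            if (!accumulating) {
                buffer.clear(); // discarding mode: forget what was already emitted
            }
        }
        return panes;
    }

    public static void main(String[] args) {
        // An element at t=7 with window size 10 and period 5 lands in two windows.
        System.out.println(slidingWindowStarts(7, 10, 5)); // [5, 0]

        List<List<String>> batches = List.of(List.of("a", "b"), List.of("c"));
        System.out.println(firePanes(batches, true));  // [[a, b], [a, b, c]]
        System.out.println(firePanes(batches, false)); // [[a, b], [c]]
    }
}
```

In this model the element `"a"` belongs to one window but shows up in two panes when the mode is accumulating, and in exactly one pane when the mode is discarding. That is roughly the answer to "one pane or multiple panes?": it depends on how often the trigger fires and on the accumulation mode.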