
Signaling graceful shutdown to the pipeline


I'm working with Beam 2.3.0 at the moment. I've spent two days investigating how to gracefully shut down a pipeline running on the DirectRunner. Setting blockOnRun to false and calling cancel just kills the pipeline, so data can be lost. Is it possible to drain the pipeline before killing it, as the Dataflow runner does?


Solution

  • This feature does not exist yet at the level of the Beam model. The only runner that implements something like it is Dataflow, through its Drain feature. There is a proposal under discussion to make draining a general Beam API.
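For reference, the cancel-based shutdown described in the question looks roughly like the sketch below. It is only an illustration of the behaviour, not a drain: cancel() stops the pipeline without waiting for in-flight elements. The class name CancelSketch, the method runThenCancel, the 5-second grace period, and the small bounded Create input are assumptions made for the example; a real streaming job would read from an unbounded source instead.

```java
import java.io.IOException;

import org.apache.beam.runners.direct.DirectOptions;
import org.apache.beam.runners.direct.DirectRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.joda.time.Duration;

// Hypothetical class name, used only for this sketch.
public class CancelSketch {

  // Runs a pipeline without blocking, waits up to graceSeconds, then cancels
  // it if it has not reached a terminal state. Returns the final state.
  static PipelineResult.State runThenCancel(String[] args, long graceSeconds)
      throws IOException {
    DirectOptions options =
        PipelineOptionsFactory.fromArgs(args).as(DirectOptions.class);
    options.setRunner(DirectRunner.class);
    // With blockOnRun=false, run() returns immediately instead of waiting
    // for the pipeline to finish.
    options.setBlockOnRun(false);

    Pipeline pipeline = Pipeline.create(options);
    // Placeholder bounded input (an assumption for the example).
    pipeline.apply(Create.of(1, 2, 3));

    PipelineResult result = pipeline.run();

    // Wait for a bounded amount of time; the result may be null or
    // non-terminal if the pipeline is still running.
    PipelineResult.State state =
        result.waitUntilFinish(Duration.standardSeconds(graceSeconds));

    if (state == null || !state.isTerminal()) {
      // cancel() stops execution right away. Elements still in flight are
      // not flushed, which is the data-loss risk raised in the question.
      result.cancel();
    }
    return result.getState();
  }

  public static void main(String[] args) throws IOException {
    System.out.println("Final state: " + runThenCancel(args, 5));
  }
}
```

Until a drain API exists for the DirectRunner, a bounded wait before cancel only helps pipelines that can finish on their own; it does not protect unbounded pipelines from losing buffered elements.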