apache-nifi

How to limit a NiFi processor to run on a single node in a cluster?


We are building a data workflow with NiFi and want the final (custom) processor, which runs the deduplication logic, to run on only one of the NiFi cluster nodes instead of on all of them. I see that NiFi 1.7.0 (which is not yet released) has a @PrimaryNodeOnly annotation to enforce single-node execution. Is there a way or workaround to enforce such behaviour in NiFi 1.6.0?

NOTE: In addition to @PrimaryNodeOnly, it would be better if NiFi provided a way to run a processor on any single node (i.e., some annotation like @SingleNodeOnly). That way the execution node would not have to be the primary node, which would reduce the load on the primary node. This is only a request for the future and is not needed to solve the problem described above.


Solution

  • There is no specific workaround to enforce this in earlier versions; it is up to the data flow designer to set the intended processor(s) to run on the primary node only. You could write a script that queries the NiFi REST API for processors of certain types or names, then checks and, if necessary, sets their execution strategy to primary node only.
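
A minimal sketch of such a script is shown here, using only the Python standard library. It rests on several assumptions that should be checked against the REST API documentation for your NiFi version: that the instance is unsecured at http://localhost:8080, that GET /flow/process-groups/{id} lists the processors directly inside a group, that PUT /processors/{id} accepts a partial config together with the current revision, and that the execution setting is the config field executionNode with the values ALL and PRIMARY. The processor type MyDedupProcessor and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: an unsecured NiFi instance; a secured cluster needs TLS and a token.
NIFI_API = "http://localhost:8080/nifi-api"


def needs_primary(entity, type_suffix):
    """True if the processor matches the type and is not yet primary-node-only."""
    comp = entity["component"]
    return (comp["type"].endswith(type_suffix)
            and comp["config"].get("executionNode") != "PRIMARY")


def build_update(entity):
    """Build the PUT body: the current revision plus the one config change."""
    return {
        "revision": entity["revision"],
        "component": {
            "id": entity["id"],
            "config": {"executionNode": "PRIMARY"},
        },
    }


def call(method, path, body=None):
    """Send a JSON request to the NiFi REST API and decode the JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        NIFI_API + path, data=data, method=method,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def enforce_primary(group_id, type_suffix):
    """Switch matching processors in one process group to the primary node."""
    flow = call("GET", f"/flow/process-groups/{group_id}")
    changed = []
    for entity in flow["processGroupFlow"]["flow"]["processors"]:
        if needs_primary(entity, type_suffix):
            call("PUT", f"/processors/{entity['id']}", build_update(entity))
            changed.append(entity["component"]["name"])
    return changed


if __name__ == "__main__":
    # "MyDedupProcessor" is a hypothetical custom processor class name.
    print(enforce_primary("root", "MyDedupProcessor"))
```

The sketch only inspects processors directly inside the given process group, so nested groups would need a recursive walk. NiFi generally rejects configuration changes to a running processor, so the target processors would have to be stopped first. Running the script periodically would catch processors that were added or reconfigured later.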