process-management, apache-nifi

Is there a mechanism for preserving Processor state between calls?


Is there a mechanism for preserving/saving Processor state between calls? In particular, I want a reliable mechanism to know when my process last ran, even if the processor, or NiFi itself, has been restarted.

(Please don't give answers such as HBase or the file system. I am looking for something provided by NiFi, or that can be built with services provided by NiFi.)


Solution

  • There is currently no out-of-the-box functionality that automatically captures this information uniformly across the application for all processors.

    Components can, however, implement this kind of functionality themselves via ControllerServices (think of these as components for cross-cutting concerns or aspects), such as the DistributedMapCacheServer/Client or DistributedSetCacheServer/Client.

    Some processors, such as DetectDuplicate and ListHDFS, use these controller services in a manner analogous to your desired feature.

    This is where things stand currently. There is work under way for the next release (0.5.0) that adds more framework functionality for such tasks; that work is outlined in our State Management Feature Proposal.

    If none of these items quite fits your needs, or you have other ideas, I encourage you to share them with the community, either on our mailing lists if you want to hash out your ideas, or in JIRA.
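
The controller-service approach described above boils down to storing a "last run" timestamp under a per-processor key in a store that outlives the processor instance. The following is only a minimal sketch of that pattern, not NiFi code: `KeyValueStore`, `InMemoryStore`, and `LastRunTracker` are hypothetical names invented for illustration, and the in-memory map merely stands in for whatever cache client service a real processor would be configured with.

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class LastRunTracker {

    /** Hypothetical stand-in for a cache client service; NOT a NiFi interface. */
    public interface KeyValueStore {
        Optional<String> get(String key);
        void put(String key, String value);
    }

    /** Demo-only store. A real one must live outside the processor to survive restarts. */
    public static class InMemoryStore implements KeyValueStore {
        private final Map<String, String> data = new ConcurrentHashMap<>();

        @Override
        public Optional<String> get(String key) {
            return Optional.ofNullable(data.get(key));
        }

        @Override
        public void put(String key, String value) {
            data.put(key, value);
        }
    }

    private final KeyValueStore store;
    private final String key;

    public LastRunTracker(KeyValueStore store, String processorId) {
        this.store = store;
        // One key per processor so several processors can share the same store.
        this.key = "last-run:" + processorId;
    }

    /** Returns the last recorded run time, or empty if none has been recorded. */
    public Optional<Instant> lastRun() {
        return store.get(key).map(Instant::parse);
    }

    /** Records a run time as an ISO-8601 string. */
    public void recordRun(Instant when) {
        store.put(key, when.toString());
    }

    public static void main(String[] args) {
        KeyValueStore shared = new InMemoryStore();

        LastRunTracker first = new LastRunTracker(shared, "my-processor");
        System.out.println("Before first run: " + first.lastRun());
        first.recordRun(Instant.now());

        // Simulate a processor restart: a new tracker instance, same external store.
        LastRunTracker afterRestart = new LastRunTracker(shared, "my-processor");
        System.out.println("After restart: " + afterRestart.lastRun());
    }
}
```

The tracker holds no state of its own; everything lives in the store, which is why a fresh instance can still answer "when did I last run?". Whether the value also survives a NiFi restart depends entirely on how the backing store persists its data.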