Search code examples
etlapache-nifidataflowhortonworks-dataflow

Difference Between Processor Properties and Flowfile Attributes in Apache NiFi


My current understanding is that NiFi processor properties are specific to that processor. So adding a new property to a processor will only be visible within that processor and not be passed on to later processor blocks?

This is why UpdateAttribute is necessary to add metadata that stays with the flowfile as it traverses through the data flow:

Update Attribute NiFi Processor Block

So what is the value in allowing the user to add custom properties in a processor beyond the ones defined and required for that processor to execute? Is it analogous to creating variables that can then be used in other properties?

Processor Block Properties


Solution

  • A very good question and one that comes to everyone's mind when they start working on building data-flows in NiFi.

    First things first: Properties vs FlowFile Attributes

    As you yourself have mentioned in your question itself, Properties are something that are used to control the behavior of your Processor while Attributes are metadata of your flow-in-action.

    A simple example, lets take GetFile processor. The properties it exposes like Input Directory, File Filter, etc., tell your processor where & how to look for the source data. When the processor successfully finds some source matching your configuration, it initiates the flow, meaning a FlowFile is generated. This FlowFile will carry the content of the source data plus some metadata of the source such as the name of the file, size of the file, last modified time, etc., This metadata can actually help you down the flow with your subsequent processors like checking the file's type and route the FlowFile accordingly. And mind you, the metadata are not fixed; it differs with the different processors.

    There are few core attributes which every processor would add like application.type, filesize, uuid, path, etc.,

    What is purpose of letting users add custom properties when they are not added to the attributes?

    It is a feature that NiFi offers to processors which they can use or ignore. Not all processors allow custom properties to be added. Only selective processors do.

    Let's take InvokeHttp as an example. This processor allows the developer to create custom properties. When a user adds a new custom property, that property is added as a header to the HTTP call which the processor is going to make because the processor is built that way. It looks for any dynamic (custom) properties. If they are present, it will be considered as custom header(s) the user wants to send.

    At least, in this processor's context, it doesn't make sense to capture this header data as a metadata because it may not be useful for the subsequent processors but there are certain other processors that act differently when custom properties are provided, like UpdateAttribute whose sole purpose is add any custom property as an attribute to the incoming FlowFile.