
Apache NiFi slow cluster issue


I am using Apache NiFi for one of my clickstream projects to do some ETL.

I am currently getting around 300 messages per second of traffic with the following infrastructure:

  • RAM - 16 GB
  • Swap - 6 GB
  • CPU - 16 cores
  • Disk - 100 GB (persistence not required)
  • Cluster - 6 nodes

The entire cluster UI has become extremely slow, with the following issues:

  • Processors apply back pressure when a failure happens, which consumes a lot of threads
  • Provenance writing becomes very slow
  • Cluster heartbeats across nodes become slow

I have the following questions about the setup:

  • Is using an RPG (Remote Process Group) recommended? It is an HTTP call, and I am using it to spread the load across all the nodes because there is an existing issue with the EMQTT process for consumer group.
  • What is the recommended thread count to allot per core?
  • What are the guidelines for infrastructure sizing?
  • What are the tuning parameters for a large cluster with a high rate of incoming requests and a lot of heavy JSON parsing for transformation?

Solution

  • A couple of suggestions:

    • Yes, RPG usage is recommended. At least in my experience, an RPG seems to offer better distribution. Take a look at [3] below.
    • Some processors are more CPU intensive than others, so there is no clear-cut answer for what value to set for Concurrent Tasks. It is a matter of trial and error: test, measure, and fine-tune. One caution: if you set too many Concurrent Tasks on a CPU-intensive processor, it will seriously impact the nodes.
    • Hortonworks has published a detailed guide on this. I've provided the link below. [1]
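
As a rough starting point for the thread-count question, here is a minimal sketch. It assumes the commonly quoted rule of thumb of 2 to 4 timer-driven threads per core per node; that multiplier is an assumption and not something stated in this answer, and the function name is purely illustrative. Treat the result as a range to begin testing from, not a recommended setting.

```python
def timer_driven_thread_range(cores_per_node, low_factor=2, high_factor=4):
    """Return a (low, high) starting range for the per-node thread pool.

    The 2x-4x multipliers are an assumed rule of thumb, not a NiFi guarantee;
    the right value still has to be found by testing the actual flow.
    """
    if cores_per_node < 1:
        raise ValueError("cores_per_node must be at least 1")
    return cores_per_node * low_factor, cores_per_node * high_factor


# The nodes in the question have 16 cores each.
low, high = timer_driven_thread_range(16)
print(f"Start testing between {low} and {high} threads per node")
```

The pool is shared by every processor on a node, so the Concurrent Tasks set on individual processors should stay well below it, especially for the CPU-intensive ones.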

    Some best practices and handy guides: