apache-flink flink-streaming

Flink disableOperatorChaining Performance impact


I need to understand what the impact on job performance will be if I disable operator chaining or start a new chain.

I want to disable it only so that I can follow the job in the web UI, so I'd like to know how this will affect job performance.


Solution

  • Operator chaining (also called task chaining) fuses two or more operator subtasks into a single task that runs in one thread. This avoids the de/serialization, buffering, and thread-to-thread handover that records would otherwise incur as they travel through your streaming flow.

    An example makes this easier to understand:

    • Let's say you have two operators, one for mapping and one for filtering (map -> filter), and Flink places an instance of each into a single thread.
    • When a record arrives at the map instance, the filter function is called directly (a plain method call) once the map function is done, with no serialization or deserialization in between.
    • If you disable chaining, the record can no longer be passed directly to the next operator. It has to be serialized, handed over to the task running the next operator, and deserialized again, which hurts performance.
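
The difference described in the bullets above can be sketched in plain Java. This is a toy illustration only, not Flink code: the class `ChainingSketch`, its `map` and `filter` functions, and the `roundTrip` helper are all hypothetical names, and `DataOutputStream` merely stands in for Flink's own serializers and network buffers. The chained path hands the mapped value straight to the filter, while the unchained path pushes every record through a serialize/deserialize step first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChainingSketch {
    // Hypothetical user functions standing in for a map and a filter operator.
    static int map(int value) { return value * 2; }
    static boolean filter(int value) { return value % 3 == 0; }

    // "Chained": the filter is a direct method call on the map output.
    static List<Integer> runChained(List<Integer> input) {
        List<Integer> output = new ArrayList<>();
        for (int record : input) {
            int mapped = map(record);
            if (filter(mapped)) {
                output.add(mapped);
            }
        }
        return output;
    }

    // Stand-in for the handover between two separate tasks:
    // write the record to bytes, then read it back.
    static int roundTrip(int value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                out.writeInt(value);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "Unchained": every mapped record pays the serialization round trip
    // before the filter sees it. The result is the same, only the cost differs.
    static List<Integer> runUnchained(List<Integer> input) {
        List<Integer> output = new ArrayList<>();
        for (int record : input) {
            int received = roundTrip(map(record));
            if (filter(received)) {
                output.add(received);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(runChained(input));   // prints [6, 12]
        System.out.println(runUnchained(input)); // prints [6, 12]
    }
}
```

Both paths produce identical results; the unchained one just does extra work per record. How much that matters in a real job depends on record size, serializer, and throughput, so it is worth measuring on your own pipeline before deciding.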

    However, sometimes disabling chaining can be the better solution. Note: I have deleted my example for this situation because @DavidAnderson pointed out that it was not correct. My basic point was that there can be situations in which chaining leaves some instances idle.