Search code examples
flink-streaming

How can I specify that parts of my flink job run in different taskmanagers


I have a cluster with several taskmanagers. Each taskmanager (1 taskslot per TM) is running a different breed of job.

I have a particular job consisting on stages, which runs in 1 taskmanager (there is no rebalancing, so the graph optimizer merges everything in the same thread) and I want their 3 operators to run in 3 different taskmanagers, how do I setup that?


Solution

  • The mechanism you're looking for is slot sharing groups. This will allow you to force each stage of your pipeline into its own slot.

    Your application might perform better if instead you were to disable operator chaining (env.disableOperatorChaining() will force each pipeline stage into its own thread) and then run this job on a TM that uses 2 or 3 CPU cores per slot. With this configuration you'd be using shared memory for communication between the stages, rather than the network.