
Apache Flink: Environment Parallelism Setting not applied


I'm trying to set a job-wide default parallelism in Flink 1.8.3 (Java), as described in the documentation:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(4);

In addition, I call setParallelism(2) on the source and on the sink.

I can also see in the Flink UI that the environment setting is applied (long-running session cluster, job submitted via the REST API or the Flink UI):

[Screenshot: Flink job config]

But when I look at the parallelism of the individual stages in the Flink UI, they all run with parallelism 1 (apart from the source and sink, which run with the expected parallelism):

[Screenshot: Flink job parallelism UI]

I also tried setting the parallelism on the individual operators instead, but that did not change anything. The operators are ordinary flatMaps and filters.

What is misconfigured here, such that the operators do not respect the parallelism setting? Can't I assume that the environment-level parallelism is automatically applied to all operators? Or is there something else I need to watch out for when setting the parallelism?


Solution

  • I "fixed" it by no longer setting the parallelism inside the Flink job code and instead passing a parallelism value when starting the job. This is possible not only via the CLI, but also via the REST API and the Flink UI. Everything now works as expected for us.
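
For the REST API route, a minimal sketch of what "passing a parallelism setting when starting the job" could look like is below. It assumes the jar has already been uploaded to the session cluster and that the job is started through the `/jars/:jarid/run` endpoint with its `parallelism` query parameter. The class name `FlinkRunRequest`, the helper methods, and the command-line arguments are hypothetical and only for illustration; this is not necessarily how the job above was actually submitted.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: starts an already uploaded jar with an explicit parallelism.
class FlinkRunRequest {

    // Builds the URI of the "run jar" endpoint, passing parallelism as a query parameter.
    static URI buildRunUri(String baseUrl, String jarId, int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        return URI.create(baseUrl + "/jars/" + jarId + "/run?parallelism=" + parallelism);
    }

    // Builds the POST request; the empty JSON body means no further options are set.
    static HttpRequest buildRunRequest(String baseUrl, String jarId, int parallelism) {
        return HttpRequest.newBuilder(buildRunUri(baseUrl, jarId, parallelism))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{}"))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: FlinkRunRequest <baseUrl> <jarId> <parallelism>");
            return;
        }
        // Example arguments (assumed): http://localhost:8081 <id returned by the jar upload> 4
        HttpRequest request = buildRunRequest(args[0], args[1], Integer.parseInt(args[2]));
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Print the status code and the raw response body returned by the cluster.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

The CLI equivalent is the `-p` option of `flink run`, and the Flink UI offers a parallelism field on the submit form for an uploaded jar.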