Apache Flink local setup: practical differences between a standalone JAR and start-cluster.sh

So the code is the same in both approaches and it's roughly the following:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// prepare the topology...
env.execute();

The scenario is Flink running locally on a single machine.

Standalone JAR

pom.xml (relevant bits):

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-core</artifactId>
    <version>1.9.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_2.12</artifactId>
    <version>1.9.0</version>
</dependency>

Run with:

java -cp target/My-0.0.0.jar MainClass

`start-cluster.sh`

pom.xml (relevant bits):

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-core</artifactId>
    <version>1.9.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_2.12</artifactId>
    <version>1.9.0</version>
    <scope>provided</scope>
</dependency>

Run with:

/path/to/flink-1.9.1/bin/flink run -c MainClass target/My-0.0.0.jar

This documentation page states:

The LocalExecutionEnvironment is starting the full Flink runtime, including a JobManager and a TaskManager. These include memory management and all the internal algorithms that are executed in the cluster mode.

This makes me think that there are no practical differences between the two, but I can't be sure.

Is there anything else I need to consider? Would there be any differences in terms of performance?

Solution

As you have found in the documentation, there is little to no difference between the two modes.

LocalExecutionEnvironment will create a mini cluster with a locally running job manager, resource manager, and the configured number of task managers (local.number-taskmanager, default is 1).

If you run start-cluster.sh, it will spawn the same things. The number of task managers depends on your conf/slaves, though (which contains only localhost by default).

The main difference is that the LocalExecutionEnvironment runs all these services inside the same JVM, as it is primarily meant to be a debug tool that you run from your IDE. In cluster mode with the default configuration, you will end up with two separate JVM processes: one job manager and one task manager. Performance-wise, I'd expect no visible difference, as the main load is processed on the task managers. Only the coordination messages (sent as remote procedure calls) should be faster within the same JVM process.
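One way to see the single-JVM versus multi-process difference for yourself is to log which JVM a piece of code runs in. The following is a minimal sketch using only the JDK; the class name `JvmIdentity` and its methods are hypothetical helpers, not part of Flink.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper (not a Flink API): identifies the JVM that executes the calling code.
public class JvmIdentity {

    // The runtime MXBean name is JVM-specific; on common JVMs it looks like "<pid>@<hostname>".
    public static String current() {
        return ManagementFactory.getRuntimeMXBean().getName();
    }

    // Builds a log line such as "<where> runs in JVM <name> on thread <thread name>".
    public static String tag(String where) {
        return where + " runs in JVM " + current()
                + " on thread " + Thread.currentThread().getName();
    }

    public static void main(String[] args) {
        System.out.println(tag("main"));
    }
}
```

If you log `JvmIdentity.tag("main")` in your main method and `JvmIdentity.tag("operator")` from inside one of your functions, I would expect both lines to report the same JVM when you start the job with `java -cp`, and different JVMs when you submit it with `flink run`.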

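However, be aware of subtle differences in the configuration and, most importantly, in classloading. Since local mode has everything in one process on a single classpath, your job might see more or different classes than in cluster mode. So before going to production, make sure to test it in a cluster setup. You also have no Web UI in local mode by default; StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(...) can start one if flink-runtime-web is on the classpath.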
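To compare what your job can see in each mode, a small diagnostic like the one below can help. It is a sketch using only the JDK; `ClassOrigin`, `describe`, and `isVisible` are hypothetical names, and the lookup through the thread context classloader assumes that this is the loader your user code runs under.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper (not a Flink API): reports where classes come from.
public class ClassOrigin {

    // Describes the JAR or directory a class was loaded from and the classloader type used.
    public static String describe(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        ClassLoader loader = cls.getClassLoader();
        String loaderName = (loader == null) ? "bootstrap" : loader.getClass().getName();
        return cls.getName() + " <- "
                + (location == null ? "(no code source)" : location.toString())
                + " via " + loaderName;
    }

    // Returns true if the named class can be found through the current thread's
    // context classloader; the class is not initialized by this check.
    public static boolean isVisible(String className) {
        try {
            Class.forName(className, false, Thread.currentThread().getContextClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(ClassOrigin.class));
        System.out.println(describe(String.class));
        System.out.println(isVisible("org.apache.flink.streaming.api.environment.StreamExecutionEnvironment"));
    }
}
```

Calling `ClassOrigin.describe(...)` on one of your own classes or on a library class from inside an operator, once in local mode and once on the cluster, should show whether the class is resolved from your JAR or from somewhere else in each setup.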