Please help: I have an Apache Flink cluster (2 Job Managers, 3 Task Managers), but I don't know which values to set for these parameters in flink-conf.yaml:
jobmanager.heap.size
taskmanager.heap.size
taskmanager.numberOfTaskSlots
parallelism.default
Job Manager machine: 8 CPUs, 32 GB RAM
Task Manager machine: 8 CPUs, 32 GB RAM
I plan to run 15 to 20 Apache Flink jobs on this cluster. Because of an internal policy I can't post the Java code here, so I'll try to describe it in words.
More than 50 million events are expected per day. All jobs will have one data source.
I would consider using a resource manager such as YARN, Mesos, or Kubernetes in order to have high availability. In a nutshell, this is what they do for you:
When deploying a Flink application, Flink automatically identifies the required resources based on the application’s configured parallelism and requests them from the resource manager. In case of a failure, Flink replaces the failed container by requesting new resources. All communication to submit or control an application happens via REST calls. This eases the integration of Flink in many environments.
In other words, they can offer the cluster's resources on demand to the Flink engine, and you will have less trouble configuring the parameters you are asking about.
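If you stay on a standalone cluster for now, a rough slot budget can still help you pick taskmanager.numberOfTaskSlots and parallelism.default. The sketch below is only back-of-the-envelope arithmetic in Python, not a tuning recommendation. The machine counts, job counts and event volume come from your question. One slot per CPU core is an assumption on my side (a common starting point), and the function names are made up for this example. It relies on the fact that, with Flink's default slot sharing, a job needs as many slots as its highest operator parallelism.

```python
# Hypothetical capacity check. Cluster size, job count and event volume are
# taken from the question; one slot per core is an ASSUMPTION, not a measurement.
TASK_MANAGERS = 3
CORES_PER_TASK_MANAGER = 8
SLOTS_PER_TASK_MANAGER = CORES_PER_TASK_MANAGER  # assumed taskmanager.numberOfTaskSlots

EVENTS_PER_DAY = 50_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def total_slots(task_managers: int, slots_per_tm: int) -> int:
    """Slots available in the whole cluster."""
    return task_managers * slots_per_tm


def slots_needed(job_count: int, parallelism: int) -> int:
    """Slots used if every job runs at the same parallelism (default slot sharing)."""
    return job_count * parallelism


def max_uniform_parallelism(slots: int, job_count: int) -> int:
    """Largest parallelism every job can get if all jobs run at the same time."""
    return slots // job_count


def average_events_per_second(events_per_day: int) -> float:
    """Average input rate; real traffic will have peaks above this."""
    return events_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    slots = total_slots(TASK_MANAGERS, SLOTS_PER_TASK_MANAGER)
    print(f"total slots: {slots}")
    for jobs in (15, 20):
        p = max_uniform_parallelism(slots, jobs)
        print(f"{jobs} jobs: parallelism {p}, slots used {slots_needed(jobs, p)}")
    print(f"average events/s: {average_events_per_second(EVENTS_PER_DAY):.0f}")
```

Under these assumptions the cluster has 24 slots, so 15 to 20 concurrent jobs only fit if each runs at parallelism 1, and the average load is a few hundred events per second. Heap sizes cannot be derived this way; they depend on state size and the operators used, so measure them under realistic load.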