apache-spark, spark-streaming, timeline

Why do Spark Streaming executors start at different times?


I'm using Spark Streaming 1.6 with Kafka as the source.

My input arguments are as follows:

num-executors    5
num-cores        4
batch Interval  10 sec
maxRate         600
blockInterval   350 ms

Why do some of my executors start later than others?



Solution

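  • Those are not the executors' start times; they are the tasks' start times.

    This is most likely due to locality scheduling. Spark delays launching a task for a short time while it waits for a free slot on the executor with the best data locality for that task, and only falls back to a less local executor once that wait expires. See the "spark.locality.wait" setting in Spark's configuration documentation for further details.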
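
    To check whether locality scheduling is really what staggers the task start times, one option is to lower the wait and compare the timeline. The following is a minimal sketch, not the asker's actual setup: the object name and application name are hypothetical, and "0s" is only an experimental value (the documented default is 3s). It assumes the job is launched with spark-submit, which supplies the master URL.

    ```scala
    import org.apache.spark.SparkConf

    // Hypothetical driver object, for illustration only.
    object LocalityWaitSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("locality-wait-sketch") // hypothetical application name
          // How long the scheduler waits for a more local slot before
          // falling back to a less local one. Default is 3s; 0s means
          // tasks are launched on any free executor without waiting.
          .set("spark.locality.wait", "0s")

        // Print the effective value to confirm the setting was applied.
        println(conf.get("spark.locality.wait"))

        // The StreamingContext and the Kafka input stream would be
        // created from `conf` here, as in the existing job.
      }
    }
    ```

    The same setting can be passed at submit time with --conf spark.locality.wait=0s instead of being hard-coded. Setting it to zero trades data locality for earlier task starts, so it is better treated as a diagnostic than as a default.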