I'm asking a very basic question about Spark to clear up my confusion. I'm very new to Spark and still trying to understand how it works internally.
Say I have a list of input files (assume 1000) that I want to process or write somewhere, and I want to use coalesce to reduce my partition count to 100.
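Concretely, the job I have in mind looks roughly like this (paths and names are made up):

import org.apache.spark.sql.SparkSession

object CoalesceJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("coalesce-job").getOrCreate()

    // ~1000 input files matched by the glob; Spark creates partitions from the files/blocks
    val input = spark.read.text("hdfs:///data/input/*.txt")

    // merge down to 100 partitions before writing => ~100 output files
    input.coalesce(100).write.text("hdfs:///data/output")

    spark.stop()
  }
}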
Now I run this job with 12 executors and 5 cores per executor, which means 60 tasks can run at a time. Does that mean each task will work on one single partition independently, like this?
Round 1: 12 executors, each with 5 cores => 60 tasks process 60 partitions
Round 2: 8 executors, each with 5 cores => 40 tasks process the remaining 40 partitions, and 4 executors get no work the second time
Or will all tasks from the same executor work on the same partition, like this?
Round 1: 12 executors => process 12 partitions
Round 2: 12 executors => process 12 partitions
Round 3: 12 executors => process 12 partitions
...
Round 9 (96 partitions already processed): 4 executors => process the remaining 4 partitions
Say I have a list of input files (assume 1000) that I want to process or write somewhere, and I want to use coalesce to reduce my partition count to 100.
In Spark, the default number of partitions = the number of HDFS blocks, and since coalesce(100) is specified, Spark will merge the input data into 100 partitions.
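You can observe this with getNumPartitions; a minimal sketch, reusing the made-up input path from the question:

val raw = spark.read.text("hdfs:///data/input/*.txt")
println(raw.rdd.getNumPartitions)    // roughly one partition per HDFS block, e.g. ~1000

val merged = raw.coalesce(100)
println(merged.rdd.getNumPartitions) // 100

Note that coalesce merges existing partitions without a shuffle, which is why it can only decrease the partition count; use repartition if you need to increase it or rebalance evenly.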
Now I run this job with 12 executors and 5 cores per executor, which means 60 tasks can run at a time. Does that mean each of the tasks will work on one single partition independently?
You might have passed:
--num-executors 12: number of executors to launch for the application.
--executor-cores 5: number of cores per executor; 1 core runs 1 task at a time.
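For reference, the same sizing can be set programmatically; a sketch, using the config keys that back those two flags:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("coalesce-job")
  .config("spark.executor.instances", "12") // --num-executors 12
  .config("spark.executor.cores", "5")      // --executor-cores 5
  .getOrCreate()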
So the execution over the partitions will go like this:

60 partitions will be processed in the first wave, by 12 executors running 5 tasks (threads) each.
The remaining 40 partitions will be processed in the second wave, by 8 executors running 5 tasks each, while the other 4 executors sit idle.

So your first breakdown is the right mental model: each task works on exactly one partition, and each core runs one task at a time.
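As a back-of-the-envelope check (plain Scala arithmetic, not a Spark API):

val numExecutors  = 12
val coresPerExec  = 5
val numPartitions = 100

val slots = numExecutors * coresPerExec                     // 60 tasks can run concurrently
val waves = math.ceil(numPartitions.toDouble / slots).toInt // 2
println(s"wave 1: $slots partitions, wave 2: ${numPartitions - slots} partitions")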
NOTE: Usually, some executors complete their assigned work quickly (depending on data locality, network I/O, CPU, etc.). A free executor then picks up the next pending partition, after the scheduler has waited the configured locality time for a better-placed slot.
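That wait is controlled by Spark's locality-wait setting. A minimal sketch of tuning it (3s is Spark's documented default; whether to change it depends on your workload):

// spark.locality.wait: how long the scheduler waits for a data-local slot
// before falling back to a less local one (default: 3s)
val spark = SparkSession.builder()
  .appName("coalesce-job")
  .config("spark.locality.wait", "3s")
  .getOrCreate()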