Let's say we have a Spark cluster with the following configuration:
- 1 driver and 1000 workers
- Each worker has 5 executors
- Each executor has 4 cores
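For scale, here is a quick back-of-the-envelope calculation of the parallelism this configuration implies (the values are taken directly from the bullets above, nothing is measured):

```scala
// Cluster-wide parallelism implied by the configuration above.
val workers            = 1000
val executorsPerWorker = 5
val coresPerExecutor   = 4

val totalExecutors = workers * executorsPerWorker      // 5000 executors
val totalCores     = totalExecutors * coresPerExecutor // 20000 task slots
```

So the cluster can run far more than 50 tasks in parallel, which is what makes the question below interesting.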
How do coalesce/repartition work when we call `coalesce(50)` or `repartition(50)`?
- Does Spark create 50 partitions, each on a separate worker, i.e. does it use 50 workers?
or
- Or can multiple partitions live on a single worker? For example, in the above case, since there are 5 executors per worker, would each worker hold 5 partitions (1 partition per executor), so Spark ends up using only 10 workers?
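For reference, a minimal sketch of the two calls I am asking about. The `spark.range` data and the initial partition count of 1000 are made up purely for illustration; I only use it to confirm the resulting number of partitions, not where they are placed:

```scala
import org.apache.spark.sql.SparkSession

object RepartitionVsCoalesce {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("coalesce-vs-repartition")
      .getOrCreate()

    // Synthetic dataset that starts out with 1000 partitions.
    val df = spark.range(0L, 1000000L, 1L, 1000)

    // repartition(50): full shuffle, produces exactly 50 partitions.
    val repartitioned = df.repartition(50)

    // coalesce(50): merges existing partitions without a shuffle.
    val coalesced = df.coalesce(50)

    println(s"repartition(50) -> ${repartitioned.rdd.getNumPartitions} partitions")
    println(s"coalesce(50)    -> ${coalesced.rdd.getNumPartitions} partitions")

    spark.stop()
  }
}
```

Both calls report 50 partitions; what I want to understand is how those 50 partitions are distributed across the workers/executors described above.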