Tags: python, apache-spark, kubernetes, dynamic-memory-allocation, mlrun

Issue with dynamic allocation in PySpark session (under MLRun and in K8s)


I would like to maximize the power of the Spark cluster in an MLRun solution for my calculation, and I used these session settings for the Spark cluster under MLRun (running on a Kubernetes cluster):

spark = SparkSession.builder.appName('Test-Spark') \
    .config("spark.dynamicAllocation.enabled", True) \
    .config("spark.shuffle.service.enabled", True) \
    .config("spark.executor.memory", "12g") \
    .config("spark.executor.cores", "4") \
    .config("spark.dynamicAllocation.minExecutors", 3) \
    .config("spark.dynamicAllocation.maxExecutors", 6) \
    .config("spark.dynamicAllocation.initialExecutors", 5) \
    .getOrCreate()

The issue is that I cannot utilize the cluster's full capacity; in many cases only 1, 2, or 3 executors are used, each with a small number of cores.

Do you know how to get more resources/performance out of the Spark session? It seems that dynamic allocation does not work correctly in the MLRun & K8s & Spark combination.


Solution

  • I can fully utilize the Spark cluster (environment MLRun & K8s & Spark) when I use static parameters in the Spark session (a session with the 'dynamicAllocation' parameters did not work for me). Below are a few sample configurations (note: the K8s infrastructure must provide more than what is requested, e.g. 3 executors and 12 cores in total):

    Configuration 3x executors, total 9 cores:

    spark = SparkSession.builder.appName('Test-Spark') \
        .config("spark.executor.memory", "9g") \
        .config("spark.executor.cores", "3") \
        .config('spark.cores.max', 9) \
        .getOrCreate()
    


    Configuration 2x executors, total 8 cores:

    spark = SparkSession.builder.appName('Test-Spark') \
        .config("spark.executor.memory", "9g") \
        .config("spark.executor.cores", "4") \
        .config('spark.cores.max', 8) \
        .getOrCreate()
    

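As a quick sanity check on the static configurations above, the totals each session will claim can be computed directly. The helper below is illustrative, not part of any Spark API:

```python
# Illustrative helper (not a Spark API): totals that a static layout claims.
def total_resources(num_executors, cores_per_executor, mem_gb_per_executor):
    """Return (total_cores, total_memory_gb) for a static executor layout."""
    return (num_executors * cores_per_executor,
            num_executors * mem_gb_per_executor)

# 3 executors x 3 cores, 9g each -> spark.cores.max = 9, 27g memory in total
print(total_resources(3, 3, 9))   # (9, 27)
# 2 executors x 4 cores, 9g each -> spark.cores.max = 8, 18g memory in total
print(total_resources(2, 4, 9))   # (8, 18)
```

Note that `spark.cores.max` must equal executors × cores per executor, otherwise Spark will round the executor count down.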

    Alternatively, it is possible to use the Spark operator; for details see link.
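    If you still want dynamic allocation, note that a Kubernetes cluster has no external shuffle service, so `spark.shuffle.service.enabled` generally does not help there; since Spark 3.0 the usual substitute is shuffle tracking. A hedged sketch of such a configuration (the values are illustrative, and MLRun's own resource limits may still cap the executors):

```python
# Sketch only: dynamic allocation on K8s via shuffle tracking (Spark 3.0+).
# Expressed as a plain dict so the key/value pairs are easy to inspect.
dynamic_alloc_conf = {
    "spark.dynamicAllocation.enabled": "true",
    # K8s has no external shuffle service; shuffle tracking replaces it:
    "spark.dynamicAllocation.shuffleTracking.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "3",
    "spark.dynamicAllocation.maxExecutors": "6",
    "spark.dynamicAllocation.initialExecutors": "5",
}

# Applying it to a builder (requires a PySpark environment):
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.appName("Test-Spark")
# for key, value in dynamic_alloc_conf.items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
```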