
Spark 2.1.1 with property spark.akka.threads = 12


I'm using Apache Spark 2.1.1 and Spark JobServer Spark 2.0 Preview.

I can see on the Spark UI Environment tab that there is a config property spark.akka.threads = 12, but this parameter doesn't exist in the Spark 2.1.1 configuration documentation. I did find it in the Spark 1.2.1 configuration, where it defaults to 4 with the following description:

Number of actor threads to use for communication. Can be useful to increase on large clusters when the driver has a lot of CPU cores.

I'm using Spark standalone on a single machine which hosts both the Master and a Worker.

Searching for information about it, I found a recommendation (here) saying that it shouldn't be greater than 8.

My questions:

If I'm not setting this property, is the JobServer setting it? If so, why is it doing that when the property no longer appears in the official Spark documentation?

What kind of problems could such a high value cause on a small, non-clustered Spark standalone setup?


Solution

  • Spark 1.6 and 2.x do not use Akka, which is why the property is no longer listed in the documentation and setting it has no effect. For more details, see this Jira and this commit.

    Description of that Jira task:

    A lot of Spark user applications are using (or want to use) Akka. Akka as a whole can contribute great architectural simplicity and uniformity. However, because Spark depends on Akka, it is not possible for users to rely on different versions, and we have received many requests in the past asking for help about this specific issue. For example, Spark Streaming might be used as the receiver of Akka messages - but our dependency on Akka requires the upstream Akka actors to also use the identical version of Akka.

    Since our usage of Akka is limited (mainly for RPC and single-threaded event loop), we can replace it with alternative RPC implementations and a common event loop in Spark.

    Akka was replaced by Spark's own RPC implementation, which is built on Netty.
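
    If you actually want to tune RPC threading on the Netty-based stack, the closest analogue is the internal, undocumented property spark.rpc.netty.dispatcher.numThreads (read by Spark's RPC Dispatcher; by default it is max(2, number of available cores)). A sketch of what that would look like in spark-defaults.conf - note this is an internal knob, so verify it against the source of your Spark version before relying on it:

    ```
    # spark-defaults.conf (illustrative; spark.rpc.netty.dispatcher.numThreads
    # is internal and undocumented - check your Spark version's Dispatcher source)
    spark.rpc.netty.dispatcher.numThreads   8
    ```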

    See also: Why Spark 1.6 does not use Akka? - a very similar answer, though that question asked directly why Akka is not used, rather than whether it is used.

  • You must have this property in some configuration file, or it is being set with --conf. Every configuration property that comes from a configuration file or is passed with --conf is listed in the Spark UI.
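
    To track down where the property comes from, one option is to grep the configuration files that Spark and the JobServer read. A minimal sketch - the sample file below is fabricated for illustration; in a real check, point grep at $SPARK_HOME/conf/spark-defaults.conf and at the JobServer's *.conf files (paths depend on your installation):

    ```shell
    # Stand-in for spark-defaults.conf, created just for this demo.
    conf=$(mktemp)
    printf 'spark.master local[*]\nspark.akka.threads 12\n' > "$conf"

    # Find which line sets the stale property; -n prints the line number.
    grep -n 'spark.akka.threads' "$conf"

    rm -f "$conf"
    ```

    If nothing turns up in the config files, check the spark-submit invocation (or the JobServer launch scripts) for a --conf spark.akka.threads=... argument, since those also end up in the Environment tab.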