Spark 1.6 can be configured to use Akka or Netty for RPC. If Netty is configured, does that mean the Spark runtime does not employ the actor model for messaging (e.g. between worker and driver BlockManagers), or is a custom, simplified actor model built on top of Netty still used?
I think Akka itself relies on Netty, and Spark uses only a subset of Akka. Still, is configuring Akka better for scalability (in terms of the number of workers) compared to Netty? Any suggestions on this particular Spark configuration?
Adding to @user6910411's answer, which nicely explained the design decision:
As explained by the link, flexibility and removing the dependency on Akka were the reasons behind the design decision.
Question:
I think Akka itself relies on Netty, and Spark uses only a subset of Akka. Still, is configuring Akka better for scalability (in terms of the number of workers) compared to Netty? Any suggestions on this particular Spark configuration?
Yes, Spark 1.6 can be configured to use either Akka or Netty for RPC.
It can be configured through the spark.rpc property, i.e.
val rpcEnvName = conf.get("spark.rpc", "netty")
which means the default is netty.
Please see the 1.6 code base (RpcEnv.scala).
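For illustration, here is a minimal sketch (Spark 1.6 only) of selecting the RPC implementation through SparkConf; the app name and master URL are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RpcConfigSketch extends App {
  // Spark 1.6 only: choose the RPC implementation explicitly.
  // "netty" is the default; "akka" selects the Akka-based RpcEnv.
  val conf = new SparkConf()
    .setAppName("rpc-demo")   // placeholder app name
    .setMaster("local[2]")    // placeholder master URL
    .set("spark.rpc", "akka") // or "netty" (the default)

  val sc = new SparkContext(conf)
  println(sc.getConf.get("spark.rpc")) // prints the configured RPC backend
  sc.stop()
}
```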
Here are more insights on when to choose which:
Akka and Netty both deal with asynchronous processing and message handling, but they work at different levels with respect to scalability.
Akka is a higher-level framework for building event-driven, scalable, fault-tolerant applications. It centers on the Actor abstraction for message processing. Actors are arranged hierarchically; parent actors are responsible for supervising their child actors.
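To make that concrete, here is a minimal classic Akka sketch (not Spark code; the actor and message are hypothetical). Messages are processed one at a time in receive, and the actor lives in a supervision hierarchy rooted at the ActorSystem:

```scala
import akka.actor.{Actor, ActorSystem, Props}

// A hypothetical actor: reacts to string messages, one at a time.
class EchoActor extends Actor {
  def receive: Receive = {
    case msg: String => println(s"received: $msg")
  }
}

object AkkaSketch extends App {
  val system = ActorSystem("demo")                      // root of the actor hierarchy
  val echo = system.actorOf(Props[EchoActor], "echo")   // child supervised by the system
  echo ! "hello"                                        // fire-and-forget message send
  Thread.sleep(500)                                     // let the message be processed
  system.terminate()
}
```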
Netty also works around messages, but it is a little lower-level and deals more with networking. It has NIO at its core. Netty supports many protocols such as HTTP, FTP, and SSL, and gives you more fine-grained control over the threading model.
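By contrast, here is a minimal Netty echo-server sketch (hypothetical port and handler) showing the lower-level, channel/pipeline-oriented style and the explicit event-loop (thread) groups:

```scala
import io.netty.bootstrap.ServerBootstrap
import io.netty.channel.{ChannelHandlerContext, ChannelInboundHandlerAdapter, ChannelInitializer}
import io.netty.channel.nio.NioEventLoopGroup
import io.netty.channel.socket.SocketChannel
import io.netty.channel.socket.nio.NioServerSocketChannel

object NettySketch extends App {
  val bossGroup = new NioEventLoopGroup(1)  // accepts incoming connections
  val workerGroup = new NioEventLoopGroup() // handles I/O: explicit thread control
  try {
    val bootstrap = new ServerBootstrap()
      .group(bossGroup, workerGroup)
      .channel(classOf[NioServerSocketChannel])
      .childHandler(new ChannelInitializer[SocketChannel] {
        override def initChannel(ch: SocketChannel): Unit =
          ch.pipeline().addLast(new ChannelInboundHandlerAdapter {
            override def channelRead(ctx: ChannelHandlerContext, msg: Any): Unit =
              ctx.writeAndFlush(msg) // echo the received bytes back
          })
      })
    val f = bootstrap.bind(9999).sync() // hypothetical port
    f.channel().closeFuture().sync()    // block until the server channel closes
  } finally {
    workerGroup.shutdownGracefully()
    bossGroup.shutdownGracefully()
  }
}
```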
In fact, Netty is used within Akka itself for distributed actors.
So even though they are both asynchronous and message-oriented, with Akka you are thinking more abstractly in your problem domain, while with Netty you are more focused on the networking implementation.
Note: the spark.rpc flag, i.e. val rpcEnvName = conf.get("spark.rpc", "netty"), is no longer available in Spark 2.0; see RpcEnv.scala in the Spark 2.0 code base.