I am relatively new to cluster installations of Spark with Ambari. Recently, I was given the task of installing Spark 2.1.0 on a cluster that comes pre-installed with Ambari, Spark 1.6.2, and HDFS & YARN 2.7.3.
My task is to have Spark 2.1.0 installed, since it is the newest version and has better compatibility with SparkR, among other improvements. I searched the internet for a couple of days but only found installation guides covering either AWS or standalone Spark 2.1.0, such as the following: http://data-flair.training/blogs/install-deploy-run-spark-2-x-multi-node-cluster-step-by-step-guide/ and http://spark.apache.org/docs/latest/building-spark.html.
But none of them mention how different versions of Spark might interfere with each other. Since I need to keep this cluster running, I would like to know the potential risks to the cluster.
Is there a proper way to do this installation? Thanks a lot!
If you want your SPARK2 installation to be managed by Ambari, then SPARK2 must be provisioned by Ambari.
HDP 2.5.3 does NOT support Spark 2.1.0; it does, however, come with a technical preview of Spark 2.0.0.
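As an aside, once two Spark versions live side by side it is easy to lose track of which one a job actually picks up, since that is determined by SPARK_HOME and by whichever bin/spark-submit you invoke. A minimal sanity check is a PySpark job that prints the runtime version; the /opt/spark-2.1.0 path below is just a hypothetical location for a manual install, not something Ambari creates:

```
# version_check.py -- submit with a specific installation's spark-submit, e.g.
#   /opt/spark-2.1.0/bin/spark-submit version_check.py   (hypothetical path)
# to confirm which Spark version the job is really running against.
from pyspark import SparkContext

sc = SparkContext(appName="version-check")
print("Spark version: %s" % sc.version)  # e.g. '2.1.0' here vs '1.6.2' from the HDP install
sc.stop()
```

Running the same script through /usr/hdp/current/spark-client/bin/spark-submit (the Ambari-managed install) should print 1.6.2, which makes it easy to verify that the two installations are not stepping on each other.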
Your options are: