I am modifying default properties in Apache Spark. I spin up the clusters using EMR on AWS. However, after setting properties, I am unsure how to check whether my new configurations are replacing the default configurations.
As an example, I want to modify the default serialisation in Spark. Hence, I supply the following configuration when creating the cluster.
{
  "Classification": "spark-defaults",
  "Properties": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
  }
}
When I then check the Spark properties through the Spark UI on port 4040 or through YARN, I see the property listed as set. However, it is unclear whether Spark is actually using this property. Is there a way to check?
I ask this because I once misspelt "spark.serializer" but still saw the property set. I would have liked to see an error telling me that an unknown property was being set.
As you have already found, if you misspell a property name it is accepted but not used. Since the set of possible properties is open and users can set their own properties, raising an error on a potentially unused property is not an option.
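Because Spark will not flag an unknown key for you, one option is to check the keys yourself. Below is a minimal sketch in plain Python, not anything Spark provides: `KNOWN_KEYS` is a hand-picked subset of documented property names, and `find_suspicious_keys` and the `cutoff` value are hypothetical names and choices made up for this illustration. It takes a list of (key, value) pairs, which in PySpark could come from `spark.sparkContext.getConf().getAll()`, and reports keys that are not in the list but look very similar to one that is.

```python
import difflib

# Hand-picked subset of documented Spark property names. This list is an
# assumption for the sketch; Spark does not expose a closed list of keys.
KNOWN_KEYS = [
    "spark.serializer",
    "spark.kryoserializer.buffer.max",
    "spark.executor.memory",
    "spark.executor.cores",
    "spark.driver.memory",
]


def find_suspicious_keys(conf_pairs, known_keys=KNOWN_KEYS, cutoff=0.8):
    """Return {key: closest_known_key} for keys that look like typos.

    A key is reported only if it is NOT a known key but is a close string
    match to one. Unrelated custom keys are left alone, since user-defined
    properties are legitimate.
    """
    suspicious = {}
    for key, _value in conf_pairs:
        if key in known_keys:
            continue
        close = difflib.get_close_matches(key, known_keys, n=1, cutoff=cutoff)
        if close:
            suspicious[key] = close[0]
    return suspicious


# Example input. With PySpark, the pairs could instead be read from
# spark.sparkContext.getConf().getAll().
pairs = [
    ("spark.serialzer", "org.apache.spark.serializer.KryoSerializer"),  # typo
    ("spark.executor.memory", "4g"),
    ("my.custom.flag", "true"),  # user-defined key, not a typo
]
print(find_suspicious_keys(pairs))
```

This only catches near-misses against whatever list you maintain, so it is a sanity check rather than proof that Spark is using a given setting.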