The goal is to disable multipart uploads on Amazon EMR.
The guide says to enter classification=core-site,properties=[fs.s3.multipart.uploads.enabled=false]
under Edit Software Settings when creating the EMR cluster.
My questions are:
sparkSession.sparkContext.hadoopConfiguration.set("fs.s3.multipart.uploads.enabled","false")
in the jar to be executed on EMR?

Unfortunately, you cannot currently modify configurations on a running EMR cluster. If it's possible for you to start a new one, though, you could use the AWS EMR Console to clone your current cluster's configuration and then modify the configuration before launching it. (Note: Only the configuration is cloned, not any of the data that may be stored in HDFS or on the cluster instances' local disks.)
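For reference, the same setting can be written in the JSON form that EMR accepts for software settings when launching a new or cloned cluster. This is a minimal sketch that carries over only the one property from the question; any other classifications your cluster already uses would sit alongside it in the same list.

```json
[
  {
    "Classification": "core-site",
    "Properties": {
      "fs.s3.multipart.uploads.enabled": "false"
    }
  }
]
```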
However, I believe that what you asked about in your second question will work as intended. Have you tried this and found it not to work?
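To make that concrete, here is a minimal sketch of setting the property from inside the job. The object name and app name are placeholders of mine, not anything from your code. I am also assuming the property should be set before the job first touches S3, since Hadoop caches filesystem instances and a later change may not be picked up.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical entry point; the object and app names are illustrative only.
object DisableMultipartUpload {
  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder()
      .appName("disable-multipart-upload")
      .getOrCreate()

    // Set the property on the job's Hadoop configuration.
    // Assumption: this runs before any S3 read or write in the job.
    sparkSession.sparkContext.hadoopConfiguration
      .set("fs.s3.multipart.uploads.enabled", "false")

    // Read the value back to confirm it is present in this job's configuration.
    val value = sparkSession.sparkContext.hadoopConfiguration
      .get("fs.s3.multipart.uploads.enabled")
    println(s"fs.s3.multipart.uploads.enabled = $value")

    // ... the job's actual reads and writes would go here ...

    sparkSession.stop()
  }
}
```

If you would rather not change the jar, Spark also copies any property prefixed with spark.hadoop. into the Hadoop configuration, so passing --conf spark.hadoop.fs.s3.multipart.uploads.enabled=false to spark-submit should have the same effect. Either way, reading the value back only confirms the setting is present in the job's configuration; whether the S3 filesystem honors it at that point is worth checking with an actual upload.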