scala, apache-spark, pyspark

How to set hadoop configuration values from pyspark


The Scala version of SparkContext has the property

sc.hadoopConfiguration

I have successfully used it to set Hadoop properties (in Scala), e.g.

sc.hadoopConfiguration.set("my.mapreduce.setting", "someVal")

However, the Python version of SparkContext lacks that accessor. Is there any way to set Hadoop configuration values on the Hadoop Configuration used by the PySpark context?


Solution

  • sc._jsc.hadoopConfiguration().set('my.mapreduce.setting', 'someVal')

    should work. sc._jsc exposes the underlying JavaSparkContext through Py4J, so calling hadoopConfiguration().set(...) on it modifies the same Hadoop Configuration that the PySpark context uses. Note that the leading underscore marks _jsc as an internal attribute rather than part of the public PySpark API.
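
For reference, here is a minimal, self-contained sketch of that approach; the app name, the local[*] master, and the placeholder property name from the question are assumptions for illustration:

from pyspark import SparkConf, SparkContext

# Start a local context; any master/app name works here.
conf = SparkConf().setAppName("hadoop-conf-example").setMaster("local[*]")
sc = SparkContext(conf=conf)

# _jsc is the underlying JavaSparkContext; hadoopConfiguration() returns
# the org.apache.hadoop.conf.Configuration used by this context.
hadoop_conf = sc._jsc.hadoopConfiguration()
hadoop_conf.set("my.mapreduce.setting", "someVal")

# Read the value back to confirm it was applied.
print(hadoop_conf.get("my.mapreduce.setting"))  # -> someVal

sc.stop()

Alternatively, SparkConf keys prefixed with spark.hadoop. (for example spark.hadoop.my.mapreduce.setting) are copied into the Hadoop Configuration when the context starts, which avoids touching the internal _jsc attribute.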