Tags: apache-spark, sparkr

How to list spark-packages added to the Spark context?


Is it possible to list which Spark packages have been added to the Spark session?

The class org.apache.spark.deploy.SparkSubmitArguments has a variable for the packages:

var packages: String = null

Assuming this is the list of Spark packages, is it available via SparkContext or somewhere else?


Solution

  • I use the following method to retrieve that information: spark.sparkContext.listJars

    For example:
    $ spark-shell --packages elsevierlabs-os:spark-xml-utils:1.4.0

    scala> spark.sparkContext.listJars.foreach(println)
    spark://192.168.0.255:51167/jars/elsevierlabs-os_spark-xml-utils-1.4.0.jar
    spark://192.168.0.255:51167/jars/commons-io_commons-io-2.4.jar
    spark://192.168.0.255:51167/jars/commons-logging_commons-logging-1.2.jar
    spark://192.168.0.255:51167/jars/org.apache.commons_commons-lang3-3.4.jar
    spark://192.168.0.255:51167/jars/net.sf.saxon_Saxon-HE-9.6.0-7.jar
    

    In this case, I loaded the spark-xml-utils package, and the other jars were loaded as dependencies.
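
    If you want the original --packages coordinates rather than the resolved jar URLs, a complementary option (a minimal sketch, assuming the same spark-shell session as above) is to read the spark.jars.packages entry from the SparkConf, which Spark sets from the --packages flag:

    scala> // Prints the Maven coordinates passed via --packages, if any were given
    scala> spark.sparkContext.getConf.getOption("spark.jars.packages").foreach(println)
    elsevierlabs-os:spark-xml-utils:1.4.0

    Unlike listJars, this returns only the coordinates you requested, not the transitive dependencies that were resolved for them.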