Tags: apache-spark, hadoop-yarn

Is there a way to get the attempt number of a Spark job running on YARN?


I was wondering if there is a way to programmatically get the attempt number of a Spark job running on YARN.

I already tried using SparkListenerApplicationStart, registering the listener below when launching with spark-submit:

import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationStart}

class Listener extends SparkListener {
  // Holds the YARN application attempt id reported at application start
  var att = ""

  override def onApplicationStart(applicationStart: SparkListenerApplicationStart): Unit = {
    att = applicationStart.appAttemptId.getOrElse("")
    println(s"--------------------------------------$att------------------------------------------------")
  }
}

However, att is always empty.
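For reference, this is how such a listener is typically wired up; a minimal sketch, assuming a SparkSession named spark and the Listener class above on the classpath (alternatively, pass --conf spark.extraListeners=<fully.qualified.Listener> to spark-submit):

// Attach the listener to the running SparkContext
spark.sparkContext.addSparkListener(new Listener())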


Solution

  • I found a solution to my question:

        import org.apache.hadoop.yarn.api.records.ApplicationId
        import org.apache.hadoop.yarn.client.api.YarnClient

        // Ask the YARN ResourceManager for all attempts of this application
        val yc = YarnClient.createYarnClient()
        yc.init(spark.sparkContext.hadoopConfiguration)
        yc.start()
        val id = ApplicationId.fromString(spark.sparkContext.applicationId)
        val attempts = yc.getApplicationAttempts(id)
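
    The attempts list can then be reduced to a concrete attempt number. A minimal sketch, assuming Scala 2.12-style JavaConverters (getApplicationAttempts returns a java.util.List of ApplicationAttemptReport):

        import scala.collection.JavaConverters._

        // Each report carries an ApplicationAttemptId; the running attempt
        // is the one with the highest numeric attempt id.
        val currentAttempt = attempts.asScala
          .map(_.getApplicationAttemptId.getAttemptId)
          .max
        println(s"Current attempt number: $currentAttempt")
        yc.stop() // release the YARN client when done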