Tags: scala, apache-spark, client, hadoop-yarn

How to get YARN job status from the YARN client in Scala


I would like to poll the status of submitted Spark/YARN jobs using Scala.


Solution

  • Use the YARN client:

    Maven dependency:

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-yarn-client</artifactId>
            <version>2.6.0-cdh5.16.2</version>
        </dependency>
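
For an sbt build, the same dependency might be declared as in the sketch below. The resolver line is an assumption that is not part of the original answer: CDH-versioned artifacts are normally not published to Maven Central, so a Cloudera repository usually has to be added (the same applies to a Maven build).

```scala
// build.sbt
// Assumption: the CDH build of hadoop-yarn-client is resolved from Cloudera's public repository.
resolvers += "cloudera-repos" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

// Same coordinates as the Maven dependency above.
libraryDependencies += "org.apache.hadoop" % "hadoop-yarn-client" % "2.6.0-cdh5.16.2"
```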
    

    Scala code:

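    import org.apache.hadoop.yarn.api.records.{ApplicationId, ApplicationReport, YarnApplicationState}
    import org.apache.hadoop.yarn.client.api.YarnClient
    import org.apache.hadoop.yarn.conf.YarnConfiguration

    // hdfsCoreSiteXml, hdfsHDFSSiteXml and hdfsYarnSiteXml are not defined in this snippet;
    // they are assumed to be org.apache.hadoop.fs.Path values pointing to the cluster's
    // core-site.xml, hdfs-site.xml and yarn-site.xml.
    val yarnConf = new YarnConfiguration()
    yarnConf.addResource(hdfsCoreSiteXml)
    yarnConf.addResource(hdfsHDFSSiteXml)
    yarnConf.addResource(hdfsYarnSiteXml)

    // Create the client, then initialise and start it before making any calls.
    val client = YarnClient.createYarnClient()
    client.init(yarnConf)
    client.start()

    // An application id has the form application_<clusterTimestamp>_<sequenceNumber>.
    val appId = "application_1590803731996_57381"
    val appIdParts = appId.split("_")
    val clusterTimestamp = appIdParts(1).toLong
    val sequenceNumber = appIdParts(2).toInt // the application's sequence number, not an attempt id

    val applicationId = ApplicationId.newInstance(clusterTimestamp, sequenceNumber)

    // Ask the ResourceManager for the current report and read the YARN state from it.
    val applicationReport: ApplicationReport = client.getApplicationReport(applicationId)
    val yarnStatus: YarnApplicationState = applicationReport.getYarnApplicationState

    println("Yarn Status: " + yarnStatus.name)

    // Release the client's resources once no more calls are needed.
    client.stop()

    // Possible YarnApplicationState values:
    // NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED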
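
The snippet above reads the status once, while the question asks about polling. A minimal sketch of a polling loop follows. It is an illustration, not part of the original answer: `StatusPolling`, `pollUntilDone`, `fetchState`, `intervalMs` and `maxAttempts` are hypothetical names, and the loop treats FINISHED, FAILED and KILLED from the list above as terminal. The helper takes the state as a plain string so it has no Hadoop dependency of its own.

```scala
import scala.annotation.tailrec

object StatusPolling {
  // Names of the YarnApplicationState values after which the state no longer changes.
  val terminalStates: Set[String] = Set("FINISHED", "FAILED", "KILLED")

  /** Calls fetchState repeatedly, sleeping intervalMs between calls.
    * Returns Some(state) as soon as a terminal state is seen,
    * or None if maxAttempts calls were made without reaching one.
    */
  @tailrec
  def pollUntilDone(fetchState: () => String, intervalMs: Long, maxAttempts: Int): Option[String] =
    if (maxAttempts <= 0) None
    else {
      val state = fetchState()
      println(s"Yarn Status: $state")
      if (terminalStates.contains(state)) Some(state)
      else {
        // No need to sleep after the last allowed attempt.
        if (maxAttempts > 1) Thread.sleep(intervalMs)
        pollUntilDone(fetchState, intervalMs, maxAttempts - 1)
      }
    }
}
```

With the `client` and `applicationId` values from the snippet above, and while the client is still started, it could be called as `StatusPolling.pollUntilDone(() => client.getApplicationReport(applicationId).getYarnApplicationState.name, 5000L, 120)`. Note that a YARN state of FINISHED only means the application has ended; whether it succeeded is reported separately by `applicationReport.getFinalApplicationStatus`.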