
Why does my yarn application not have logs even with logging enabled?


I enabled logging in yarn-site.xml and then restarted YARN with:

sudo service hadoop-yarn-resourcemanager restart
sudo service hadoop-yarn-nodemanager restart

I ran my application and can see its application ID in the output of yarn application -list. When I then run yarn logs -applicationId <application ID>, I get the following:

hdfs://<ip address>/var/log/hadoop-yarn/path/to/application/  does not have any log files

Do I need to change some other configuration? Or am I accessing the logs the wrong way?

Thank you.


Solution

  • yarn application -list
    

    lists, by default, only the applications that are in the SUBMITTED, ACCEPTED, or RUNNING state.

    Log aggregation collects each container's logs and moves them to the directory configured in yarn.nodemanager.remote-app-log-dir, but only after the application has completed. See the description of the yarn.log-aggregation-enable property in yarn-default.xml.

    So the application listed by the command has not completed yet, and its logs have not been collected. That is why trying to access the logs of a running application returns:

    hdfs://<ip address>/var/log/hadoop-yarn/path/to/application/  does not have any log files
    

    Once the application has completed, run the same command, yarn logs -applicationId <application ID>, to view the logs.

    To list all the FINISHED applications, use

    yarn application -list -appStates FINISHED
    

    Or, to list the applications in every state, use

    yarn application -list -appStates ALL
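
The wait-then-fetch sequence described above could be scripted roughly as follows. This is only a minimal sketch: the helper names (is_terminal_state, parse_state, wait_and_fetch_logs) and the 10-second polling interval are hypothetical, and it assumes that yarn application -status prints a line of the form "State : RUNNING", which may differ between Hadoop versions.

```shell
#!/bin/sh

# Return success if the given state is one in which the application has
# completed (log aggregation only happens after that point).
is_terminal_state() {
  case "$1" in
    FINISHED|FAILED|KILLED) return 0 ;;
    *) return 1 ;;
  esac
}

# Read `yarn application -status` output on stdin and print the State field.
# Assumption: the report contains a line such as "State : RUNNING".
# The "Final-State" line is skipped because it does not start with "State".
parse_state() {
  sed -n 's/^[[:space:]]*State[[:space:]]*:[[:space:]]*\([A-Z_]*\).*$/\1/p' | head -n 1
}

# Poll until the application reaches a terminal state, then fetch its logs.
wait_and_fetch_logs() {
  app_id="$1"
  while :; do
    state=$(yarn application -status "$app_id" 2>/dev/null | parse_state)
    if is_terminal_state "$state"; then
      break
    fi
    sleep 10  # hypothetical polling interval
  done
  yarn logs -applicationId "$app_id"
}
```

After sourcing the script, it would be called as wait_and_fetch_logs <application ID>. Aggregation may take a short while after the application ends, so the final yarn logs call might still need to be retried once.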