java hadoop mapreduce qubole

How to kill a Hadoop job gracefully / intercept `hadoop job -kill`


My Java application runs in a mapper and creates child processes using the Qubole API. The application stores the child Qubole query IDs. I need to intercept the kill signal and shut down the child processes before exiting. The `hadoop job -kill jobId` and `yarn application -kill applicationId` commands kill the job in a SIGKILL manner, and I do not know how to intercept the shutdown. Is it possible to intercept the job kill somehow, or to configure Hadoop to give the application a chance to shut down gracefully?

When running locally (not in a mapper container), the application successfully intercepts shutdown using a ShutdownHook and is able to kill its child processes.

Please suggest how to intercept shutdown when running in a mapper. Or maybe I'm doing something wrong?


Solution

  • SIGKILL is unstoppable and no process can catch it: neither your Java application nor the JVM itself. It is, in fact, not even delivered to the process as an event. Think of it as a direct order to the kernel to destroy all of the process's resources without delay.

    From man 7 signal:

    the signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

    This is a core kernel feature; you cannot bypass it.

    Also note that, according to Prabhu (2015-07-15) on how to kill Hadoop jobs:

    Use of the following commands is deprecated

    hadoop job -list
    hadoop job -kill $jobId
    

    consider using

    mapred job -list
    mapred job -kill $jobId
    

    This is confirmed by the Apache Hadoop Deprecated API documentation.

    Unfortunately, according to the current mapred command documentation, it does not appear that you can control the type of signal sent to terminate a job.
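
For reference, the shutdown-hook approach the question describes can be sketched as below. This is a minimal sketch, not the asker's actual code: `ChildQueryCleanup` and the `canceller` callback are hypothetical names, and the callback stands in for whatever Qubole API call cancels a query by ID. The hook only runs when the JVM exits normally or receives a catchable signal such as SIGTERM; as explained above, it never runs when the process is terminated with SIGKILL.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical helper: tracks child query IDs and cancels them on JVM shutdown.
final class ChildQueryCleanup {
    // Thread-safe set, because the hook runs on a different thread than the mapper.
    private final Set<String> activeQueryIds = ConcurrentHashMap.newKeySet();

    // Assumption: wraps the real "cancel query by ID" call of the client in use.
    private final Consumer<String> canceller;

    ChildQueryCleanup(Consumer<String> canceller) {
        this.canceller = canceller;
    }

    // Call when a child query is submitted.
    void track(String queryId) {
        activeQueryIds.add(queryId);
    }

    // Call when a child query finishes on its own.
    void untrack(String queryId) {
        activeQueryIds.remove(queryId);
    }

    int activeCount() {
        return activeQueryIds.size();
    }

    // Cancels every tracked query; one failure does not stop the others.
    void cancelAll() {
        for (String queryId : activeQueryIds) {
            try {
                canceller.accept(queryId);
            } catch (RuntimeException e) {
                System.err.println("Failed to cancel " + queryId + ": " + e);
            }
            activeQueryIds.remove(queryId);
        }
    }

    // Registers cancelAll() as a JVM shutdown hook and returns the hook thread
    // so the caller can deregister it with Runtime.removeShutdownHook if needed.
    Thread installShutdownHook() {
        Thread hook = new Thread(this::cancelAll, "child-query-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

In a mapper, one would typically construct this once (for example in `setup`), call `installShutdownHook()`, and call `track` after each child query is submitted. Keep the work done in the hook short, since the JVM may be given very little time between a catchable signal and a forced kill.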