
Zeppelin persists job in YARN


When I run a Spark job from Zeppelin, the job finishes successfully, but the application stays in the RUNNING state in YARN. The problem is that it keeps holding YARN resources. I think Zeppelin is keeping the job alive in YARN.

How can I resolve this problem?

Thank you


Solution

  • There are two solutions.

    The quick one is to use the "restart interpreter" functionality, which is misnamed, since it merely stops the interpreter, and with it, in this case, the Spark application in YARN.

    The elegant one is to configure Zeppelin to use dynamic allocation with Spark. In that case the YARN application master keeps running, and with it the Spark driver, but the executors (which are the real resource hogs) can be released back to YARN when they are not in use.
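
    As a rough sketch of the second option, these are the standard Spark properties involved in dynamic allocation. They would typically be added to the Spark interpreter settings in Zeppelin, or to spark-defaults.conf. The values shown for the minimum executor count and the idle timeout are illustrative assumptions, not recommendations, and should be tuned for your cluster.

    ```properties
    # Let Spark request and release executors based on workload
    spark.dynamicAllocation.enabled              true
    # External shuffle service, so shuffle files survive executor removal
    spark.shuffle.service.enabled                true
    # Assumed value: allow scaling down to zero executors when idle
    spark.dynamicAllocation.minExecutors         0
    # Assumed value: release an executor after it has been idle this long
    spark.dynamicAllocation.executorIdleTimeout  60s
    ```

    Note that on YARN the external shuffle service also has to be set up on the NodeManagers (the spark_shuffle auxiliary service). Whether that is already in place depends on how the cluster was installed.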