Tags: apache-spark, distributed-computing, mesos

Kill a single Spark task


I have a very long-running Spark job in which a small number of tasks are currently stalled. Is there any way to kill those stalled tasks from the driver node?

For permission reasons I can log in to the slave nodes, but I cannot kill the jobs there, so I'm looking for a way to do this from the driver node alone. Note that I don't want to kill the entire Spark job - just one or two stalled tasks.

If it helps, I'm using Mesos and have access to the web UI, but it does not offer an option to kill a task.


Solution

  • No, not really.

    You cannot kill an individual Spark task manually; however, you can use Spark speculation to automatically identify tasks that are taking too long and relaunch them proactively.

    If you want to do that, set spark.speculation to true and [if you dare] tune the spark.speculation.interval, spark.speculation.multiplier, and spark.speculation.quantile configuration options (see the sketch after the links below).

    Related Docs: http://spark.apache.org/docs/latest/configuration.html#viewing-spark-properties

    Related SO: How to deal with tasks running too long (comparing to others in job) in yarn-client?
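For reference, here is a minimal sketch of what that configuration might look like when building a SparkSession in Scala. The application name is illustrative, and the values shown are Spark's documented defaults, so adjust them to taste:

    import org.apache.spark.sql.SparkSession

    // Sketch: enable speculative execution so Spark relaunches tasks that
    // run much slower than their peers, instead of waiting on them forever.
    val spark = SparkSession.builder()
      .appName("speculation-example")                 // hypothetical app name
      .config("spark.speculation", "true")            // turn speculation on (off by default)
      .config("spark.speculation.interval", "100ms")  // how often Spark checks for slow tasks
      .config("spark.speculation.multiplier", "1.5")  // "slow" = 1.5x the median task duration
      .config("spark.speculation.quantile", "0.75")   // wait until 75% of tasks finish before checking
      .getOrCreate()

The same settings can be passed without code changes via spark-submit, e.g. --conf spark.speculation=true --conf spark.speculation.quantile=0.75, which is handy when you can't modify the job itself.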