I submitted a Python job to Bluemix Spark as a Service and it failed. Unfortunately, the logging is insufficient and gives me no clue as to why it failed.
How can I increase the log level output?
Output from Spark as a Service:
==== Failed Status output =====================================================
Getting status
HTTP/1.1 200 OK
Server: nginx/1.8.0
Date: Thu, 12 May 2016 19:09:30 GMT
Content-Type: application/json;charset=utf-8
Content-Length: 850
Connection: keep-alive
{
"action" : "SubmissionStatusResponse",
"driverState" : "ERROR",
"message" : "Exception from the cluster:
org.apache.spark.SparkUserAppException: User application exited with 255
org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:88)
org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
java.lang.reflect.Method.invoke(Method.java:507)
org.apache.spark.deploy.ego.EGOClusterDriverWrapper$$anon$3.run(EGOClusterDriverWrapper.scala:430)",
"serverSparkVersion" : "1.6.0",
"submissionId" : "xxxxxx",
"success" : true
}
===============================================================================
I have run the same job successfully against a BigInsights cluster, where I also get much more verbose output.
The stderr-%timestamp% and stdout-%timestamp% files are downloaded from the cluster to the local directory where you ran spark-submit.sh.
Normally you'll find the job problems in those two files.
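If several runs have left multiple log files behind, a small helper can pick out the newest pair and print the end of each. This is only a sketch: it assumes the files sit in the current directory and start with stderr- and stdout- as described above, and the function names (latest_log, tail, show_job_logs) are made up for illustration, not part of spark-submit.sh.

```python
import glob
import os


def latest_log(directory, prefix):
    """Return the most recently modified file named '<prefix>-*', or None."""
    candidates = glob.glob(os.path.join(directory, prefix + "-*"))
    return max(candidates, key=os.path.getmtime) if candidates else None


def tail(path, lines=50):
    """Return the last `lines` lines of a text file as a list of strings."""
    with open(path, errors="replace") as handle:
        return handle.read().splitlines()[-lines:]


def show_job_logs(directory=".", lines=50):
    """Print the tail of the newest stderr-* and stdout-* files."""
    for prefix in ("stderr", "stdout"):
        path = latest_log(directory, prefix)
        if path is None:
            print("no {}-* file found in {}".format(prefix, directory))
            continue
        print("==== {} ====".format(path))
        print("\n".join(tail(path, lines)))


if __name__ == "__main__":
    # Assumption: run from the directory where spark-submit.sh was invoked.
    show_job_logs()
```

The Python traceback behind "User application exited with 255" would normally be near the end of the stderr file, which is why the sketch prints the tail rather than the whole file.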
Reference: http://spark.apache.org/docs/latest/spark-standalone.html#monitoring-and-logging