Openshift command terminated with non-zero exit code: Error executing in Docker Container: 137


I am running an OpenCPU-based image on OpenShift. Every time the pod starts, it crashes after just a few seconds with the error:

command terminated with non-zero exit code: Error executing in Docker Container: 137

The Events tab shows only the three events below, and the terminal logs show nothing either.

Back-off restarting the failed container
Pod sandbox changed, it will be killed and re-created.
Killing container with id docker://opencpu-test-temp:Need to kill Pod

I have no clue why the container restarts every few seconds. The image runs fine locally.

Can anyone give me a clue on how to debug this issue?


Solution

  • Exit code 137 is often memory-related in a Docker context.

    The exit code comes from the process that is isolated in the Docker container. Exit code 137 is 128 + 9, which means the process was terminated by signal 9 (SIGKILL). Source
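The 128 + signal-number convention can be checked with a few lines of Python. This is a minimal sketch, assuming a POSIX system where signal 9 is SIGKILL; the helper name `describe_exit_code` is made up for illustration.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain an exit code using the shell convention 128 + signal number."""
    if code > 128:
        signum = code - 128
        try:
            # Map the number back to its symbolic name, e.g. 9 -> SIGKILL.
            name = signal.Signals(signum).name
        except ValueError:
            # Unknown signal number on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

SIGKILL by itself does not prove an out-of-memory kill, since anything with sufficient privileges can send it, but in a container it is the signal the OOM killer uses.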

    From bobcares.com:

    Error 137 in Docker denotes that the container was ‘KILL’ed by ‘oom-killer’ (Out of Memory). This happens when there isn’t enough memory in the container for running the process.

    ‘OOM killer’ is a proactive process that jumps in to save the system when its memory level goes too low, by killing the resource-abusive processes to free up memory for the system.

    Have you checked the memory configuration of the container, and the memory available on the host that launches the pod? Is there really nothing in the opencpu container log?
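One way to see which memory limit the container actually received is to read it from the cgroup filesystem inside the container. The sketch below assumes the usual mount point /sys/fs/cgroup and the standard file names for cgroup v2 (`memory.max`) and cgroup v1 (`memory/memory.limit_in_bytes`); the function name `read_memory_limit` is hypothetical, and your platform may mount cgroups elsewhere.

```python
from pathlib import Path
from typing import Optional

# Relative paths to try: cgroup v2 first, then cgroup v1.
CANDIDATES = ("memory.max", "memory/memory.limit_in_bytes")


def read_memory_limit(cgroup_root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return the memory limit in bytes, or None if unlimited or not found."""
    for rel in CANDIDATES:
        path = Path(cgroup_root) / rel
        if path.is_file():
            raw = path.read_text().strip()
            # cgroup v2 writes the literal string "max" when no limit is set.
            return None if raw == "max" else int(raw)
    return None


if __name__ == "__main__":
    limit = read_memory_limit()
    if limit is None:
        print("no memory limit found")
    else:
        print(f"memory limit: {limit / 2**20:.0f} MiB")
```

Note that cgroup v1 reports a very large number rather than "max" when no limit is set. If the reported limit is smaller than what the image needs at startup, raising the pod's memory limit is the first thing to try.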

    Check the rlimit.as setting in the config file /etc/opencpu/server.conf inside the image. This is the per-request memory limit for your opencpu instance (your problem occurs at startup, so this is perhaps less likely to be the cause).
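To compare rlimit.as with the container's memory limit, the value can be pulled out of the config file programmatically. This is a rough sketch, assuming server.conf is plain JSON with a top-level "rlimit.as" key; the sample text and its value are made up for illustration and are not taken from a real installation.

```python
import json
from typing import Optional


def read_rlimit_as(conf_text: str) -> Optional[float]:
    """Return the "rlimit.as" value (bytes) from config text, or None if absent."""
    conf = json.loads(conf_text)
    value = conf.get("rlimit.as")
    return None if value is None else float(value)


if __name__ == "__main__":
    # Hypothetical sample; in practice pass the contents of
    # /etc/opencpu/server.conf from inside the image.
    sample = '{"rlimit.as": 4e9}'
    limit = read_rlimit_as(sample)
    print(f"rlimit.as = {limit / 1e9:.1f} GB")
```

If rlimit.as is larger than the memory the pod is allowed, a single request could push the container over its limit and trigger the same kill.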