Tags: amazon-web-services, kubernetes, amazon-eks, persistent-volumes, heap-dump

Persist heap dump in case of OOM in a Kubernetes pod?


I need to persist the heap dump when the Java process runs out of memory (OOM) and the pod is restarted.

I have added the following to the JVM args:

-XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dumps

...and an emptyDir volume is mounted on the same path.

The problem is that if the pod is restarted and gets scheduled on a different node, we lose the heap dump. How do I persist the heap dump even if the pod is scheduled on a different node?

We are using AWS EKS and run more than one replica of the pod.

Could anyone help with this, please?


Solution

  • As writing to EFS is too slow in your case, there is another option for AWS EKS - awsElasticBlockStore.

    Unlike an emptyDir volume, the contents of an EBS volume are persisted when a pod is removed; the volume is only unmounted. This means that an EBS volume can be pre-populated with data, and that data can be shared between pods.

    Note: You must create an EBS volume by using aws ec2 create-volume or the AWS API before you can use it.

    There are some restrictions when using an awsElasticBlockStore volume:

    • the nodes on which pods are running must be AWS EC2 instances
    • those instances need to be in the same region and availability zone as the EBS volume
    • an EBS volume can only be mounted by a single EC2 instance at a time
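
    A minimal sketch of how such a volume might be mounted at the dump path from the question. The pod name, container name, image, region/size values and volume ID are placeholders, not values from the question, and passing the flags through JAVA_TOOL_OPTIONS is just one possible way to set them. Also note that awsElasticBlockStore is deprecated in newer Kubernetes releases in favour of the EBS CSI driver, so this may need adapting to your cluster version.

    ```yaml
    # Assumes the EBS volume already exists, created for example with:
    #   aws ec2 create-volume --availability-zone=eu-west-1a --size=10 --volume-type=gp2
    # (zone, size and type here are illustrative only)
    apiVersion: v1
    kind: Pod
    metadata:
      name: heap-dump-demo              # hypothetical name
    spec:
      containers:
      - name: app                       # hypothetical container name
        image: my-java-app:latest       # hypothetical image
        env:
        - name: JAVA_TOOL_OPTIONS       # read by the JVM at startup
          value: "-XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dumps"
        volumeMounts:
        - name: heap-dumps
          mountPath: /opt/dumps         # same path as -XX:HeapDumpPath
      volumes:
      - name: heap-dumps
        awsElasticBlockStore:
          volumeID: "<volume id>"       # placeholder: ID of the pre-created EBS volume
          fsType: ext4
    ```

    Because of the single-instance restriction above, a manifest like this suits a single pod; with more than one replica, each pod would likely need its own volume.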

    See the official Kubernetes documentation page on this topic, as well as "How to use persistent storage in EKS".