I have a Kubernetes cluster that takes jobs for processing. These jobs are defined as follows:
apiVersion: batch/v1
kind: Job
metadata:
  name: process-item-014
  labels:
    jobgroup: JOB_XXX
spec:
  template:
    metadata:
      name: JOB_XXX
      labels:
        jobgroup: JOB_XXX
    spec:
      restartPolicy: OnFailure
      containers:
      - name: worker
        image: gcr.io/.../worker
        volumeMounts:
        - mountPath: /workspace
          name: workspace
        resources:
          limits:
            cpu: 500m
            memory: 1Gi
          requests:
            cpu: 500m
            memory: 512Mi
      volumes:
      - name: workspace
        hostPath:
          path: /tmp/client-workspace
Note that I'm mounting a folder from the host into the container (workspace). Note also the memory limits defined.
In my container, I download a number of files into workspace, some of them pretty large (they are downloaded with gsutil from GCS, but I don't think that's too important).
When the files I download exceed the memory limit, my code breaks with a "device out of space" error. This doesn't quite make sense, because I'm storing the files in a mount that is backed by the host's storage, which has more than enough space. The docs also say that the memory limit caps the amount of RAM available to the container, not its storage. Still, when I set the limit to XGi, it breaks quite consistently after downloading about XGi.
My container is based on ubuntu:14.04, running a shell script with a line like this:
gsutil -m cp -r gs://some/cloud/location/* /workspace/files
What am I doing wrong? I will definitely need some limits on my containers, so I can't just drop them.
The /tmp filesystem is often backed by tmpfs, which stores files in memory rather than on disk. My guess is that this is the case on your nodes, so the memory is being correctly charged to the container. Can you use an emptyDir volume instead?
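As a minimal sketch of that change, keeping the rest of your Job spec as-is and only swapping the hostPath volume for an emptyDir (the volumeMounts entry is unchanged because the volume name stays the same):

      volumes:
      - name: workspace
        emptyDir: {}   # node-local scratch directory on the node's disk; removed when the Pod terminates

By default an emptyDir is allocated from the node's filesystem, so the downloaded files won't count against the container's memory limit. Note that setting medium: Memory on the emptyDir would make it tmpfs-backed again, so leave medium unset here.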