Tags: kubernetes, redis, google-kubernetes-engine, persistent-volumes, persistent-volume-claims

Redis on GKE is running out of disk space


I just installed Redis (actually a reinstall and upgrade) on GKE via Helm. It was a pretty standard install, nothing out of the norm. Unfortunately, my "redis-master" container logs are showing sync errors over and over again:

Info 2022-02-01 12:58:22.733 MST redis 1:M 01 Feb 2022 19:58:22.733 * Waiting for end of BGSAVE for SYNC
Info 2022-02-01 12:58:22.733 MST redis 8085:C 01 Feb 2022 19:58:22.733 # Write error saving DB on disk: No space left on device
Info 2022-02-01 12:58:22.830 MST redis 1:M 01 Feb 2022 19:58:22.829 # Background saving error
Info 2022-02-01 12:58:22.830 MST redis 1:M 01 Feb 2022 19:58:22.829 # Connection with replica redis-replicas-0.:6379 lost.
Info 2022-02-01 12:58:22.830 MST redis 1:M 01 Feb 2022 19:58:22.829 # SYNC failed. BGSAVE child returned an error
Info 2022-02-01 12:58:22.830 MST redis 1:M 01 Feb 2022 19:58:22.829 # Connection with replica redis-replicas-1.:6379 lost.
Info 2022-02-01 12:58:22.830 MST redis 1:M 01 Feb 2022 19:58:22.829 # SYNC failed. BGSAVE child returned an error
Info 2022-02-01 12:58:22.832 MST redis 1:M 01 Feb 2022 19:58:22.832 * Replica redis-replicas-0.:6379 asks for synchronization
Info 2022-02-01 12:58:22.832 MST redis 1:M 01 Feb 2022 19:58:22.832 * Full resync requested by replica redis-replicas-0.:6379
Info 2022-02-01 12:58:22.832 MST redis 1:M 01 Feb 2022 19:58:22.832 * Starting BGSAVE for SYNC with target: disk
Info 2022-02-01 12:58:22.833 MST redis 1:M 01 Feb 2022 19:58:22.833 * Background saving started by pid 8086
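The "No space left on device" line means the BGSAVE child cannot write the RDB snapshot, which in turn breaks every replica's full resync. A quick way to confirm the data volume is full is to check disk usage from inside the master pod. Below is a minimal sketch; the `/data` mount path, the 90% threshold, and running it via `kubectl exec` against `redis-master-0` are assumptions based on the usual chart layout — adjust to your install:

```shell
# Report how full a mount point is; defaults to "/" so it runs anywhere.
# Inside the Redis pod you would point it at the data dir, e.g. /data.
MOUNT="${1:-/}"

# df -P gives POSIX single-row output; column 5 is "Use%".
USED=$(df -P "$MOUNT" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }')

if [ "$USED" -ge 90 ]; then
  echo "WARNING: $MOUNT is ${USED}% full"
else
  echo "OK: $MOUNT is ${USED}% full"
fi
```

For a one-off check you can skip the script entirely and just run `kubectl exec redis-master-0 -- df -h /data`.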

I then looked at my persistent volume claim "redis-data", and it is stuck in the "Pending" phase, never seeming to get out of it. If I list all of my PVCs, though, they are all Bound and appear to be healthy.
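When a claim sits in Pending, the reason is usually spelled out in its events. A couple of diagnostic commands worth running (the claim name `redis-data-redis-master-0` here is an assumption based on typical StatefulSet naming; substitute whatever `kubectl get pvc` actually shows):

```shell
# List all claims with their phase, bound volume, and requested size
kubectl get pvc

# The Events section at the bottom explains why a claim is Pending
# (no matching StorageClass, provisioning failure, etc.)
kubectl describe pvc redis-data-redis-master-0

# Check the corresponding PersistentVolumes and their capacity
kubectl get pv
```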

Clearly something isn't as healthy as it seems, but I am not sure how to diagnose it further. Any help would be appreciated.


Solution

  • So I was pretty close on the heels of it: in my case, when I uninstalled Redis it didn't remove the PVCs (which makes some sense), and when I reinstalled, the chart tried to reuse those same PVCs.

    Unfortunately, those PVCs had run out of disk space.

    I was able to manually delete the PVCs that previously existed (we didn't need to keep the data) and then reinstall Redis via Helm. At that point, it created new PVCs and worked fine.
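    The cleanup described above can be sketched as the following commands. The release name `redis`, the chart `bitnami/redis`, and the `app.kubernetes.io/name=redis` label selector are assumptions matching common chart conventions — verify with `kubectl get pvc --show-labels` first, and remember that deleting the PVCs destroys the stored data:

    ```shell
    # Remove the release (leaves the PVCs behind, as observed above)
    helm uninstall redis

    # Delete the leftover claims -- THIS DESTROYS THE STORED DATA
    kubectl delete pvc -l app.kubernetes.io/name=redis

    # Reinstall; the chart provisions fresh PVCs
    helm install redis bitnami/redis
    ```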