kubernetes, kubernetes-pod

Is there a design pattern to periodically update a database in kubernetes without inconsistency?


I have a simple Node.js API service in a Kubernetes cluster and it's connected to a MongoDB. This app will erase all the data from the DB and fill it up with new data every ten minutes.

In my current setup I have a NodePort service for the API and a ClusterIP service for the DB. I think everything works well as long as there is a single Node.js pod. However, I am afraid that if the number of Node.js pods is not one but, say, 10, the database will be deleted and re-uploaded at 10 different times.

Is there any way to ensure that, no matter how many Node.js pods there are, the database is wiped and refilled only once every 10 minutes?


Solution

  • I see two ways, but both require some code changes:

    1. Define an environment variable that enables the deletion job, and split your deployment in two: one Deployment with a single replica that has the deletion enabled, and one Deployment with all the other replicas that has it disabled (see the first manifest sketch after this list).

    2. Use a StatefulSet and run the deletion only on the first pod. You can do this by checking the pod name, which stays the same for each pod across restarts, for example "myapp-0" for the first pod (see the second sketch below).

    Both cases solve your problem but are not that elegant. Something more in line with Kubernetes design would be to remove the "deletion every 10 minutes" logic from your code and expose it as a CLI command instead. Then create a Kubernetes CronJob that runs this command every 10 minutes (see the last sketch below). This way you keep a single, "clean" Deployment, and you get all the visibility, features and guarantees of Kubernetes CronJobs.
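
A minimal sketch of option 1, assuming your app reads a hypothetical `ENABLE_DB_REFRESH` environment variable before starting its 10-minute timer; the names `myapp` and `myapp:latest` are placeholders:

```yaml
# Option 1: two Deployments built from the same image, differing only in
# replica count and the hypothetical ENABLE_DB_REFRESH flag.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-refresher          # single replica that owns the DB refresh
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      role: refresher
  template:
    metadata:
      labels:
        app: myapp
        role: refresher
    spec:
      containers:
        - name: api
          image: myapp:latest    # assumed image name
          env:
            - name: ENABLE_DB_REFRESH
              value: "true"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-api                # remaining replicas, refresh disabled
spec:
  replicas: 9
  selector:
    matchLabels:
      app: myapp
      role: api
  template:
    metadata:
      labels:
        app: myapp
        role: api
    spec:
      containers:
        - name: api
          image: myapp:latest
          env:
            - name: ENABLE_DB_REFRESH
              value: "false"
```

Both Deployments can still be selected by one Service using the shared `app: myapp` label, so traffic keeps flowing to all 10 pods.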
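A sketch of option 2. One way to let the code see its own pod name is the downward API; the app would then run the refresh only when the injected `POD_NAME` (a name chosen here for illustration) equals `myapp-0`:

```yaml
# Option 2: a StatefulSet gives pods stable, ordered names (myapp-0, myapp-1, ...).
# The pod name is injected as an environment variable via the downward API.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp
spec:
  serviceName: myapp             # headless Service assumed to exist
  replicas: 10
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: api
          image: myapp:latest    # assumed image name
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
```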
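And a sketch of the CronJob approach, assuming the refresh logic has been moved into a standalone script (here a hypothetical `refresh-db.js` shipped in the same image):

```yaml
# CronJob: Kubernetes schedules the refresh every 10 minutes,
# independently of how many API replicas are running.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: myapp-db-refresh
spec:
  schedule: "*/10 * * * *"        # every 10 minutes
  concurrencyPolicy: Forbid       # never run two refreshes at once
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: refresh
              image: myapp:latest # same image as the API, assumed
              command: ["node", "refresh-db.js"]
```

With `concurrencyPolicy: Forbid`, a new run is skipped if the previous one is still in progress, which gives you the "exactly one refresh at a time" behaviour regardless of the API replica count.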