Tags: kubernetes, gitlab-ci, kubectl

Send signal to a Kubernetes Pod from GitLab CI job


I am using a self-hosted Kubernetes cluster and I'm not using GitLab's Kubernetes integration. In my GitLab CI job, I change the configuration of a Prometheus deployment by editing its associated ConfigMap, and I want to make Prometheus aware of the new config by sending a SIGHUP signal to its process. Here is my job script, which updates the ConfigMap and sends the signal:

for x in *; do kubectl get configmap prometheus-config -o json | jq --arg name "$(echo $x)" --arg value "$(cat $x)" '.data[$name]=$value' | kubectl apply -f -; done;
kubectl exec deployments/prometheus -- /bin/sh -c "/bin/pkill -HUP prometheus"

This approach works fine from my local terminal: after manually changing the ConfigMap and sending the signal with the command above, I can see the effect in Prometheus.

The problem is that when I put these commands in my GitLab CI job script, they seem to do nothing at all. The commands run successfully and the CI job finishes, but nothing is refreshed in Prometheus.

I wonder if the way GitLab executes its jobs (the non-interactivity of the shell, etc.) causes this behavior, but I have no idea what I can do about it.

I also tried running a dummy kubectl exec in CI to see if it works at all:

kubectl exec deployments/prometheus -- /bin/sh -c "echo hi"

and it prints hi successfully. So what goes wrong with kubectl in GitLab CI when I send a signal through it?

P.S. Keeping a Pod alive and updating its configuration in place instead of simply restarting it may look like bad practice, but if I restart the Pod, Prometheus takes 5-10 minutes to read the TSDB again, and I don't want to lose my monitoring system over a mere configuration change. So I'm sticking with sending that signal for now.


Solution

  • ConfigMaps do not update inside a running Pod immediately. There can be a delay of up to about two minutes (as of v1.18) before the changes are reflected in the files mounted into the Pod.
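
    If you want to observe that lag from the CI job, one option is to compare a hash of the ConfigMap value in the API with the file the kubelet has projected into the Pod. This is only a sketch: the key name prometheus.yml, the mount path /etc/prometheus/prometheus.yml, and the availability of sha256sum in the image are all assumptions about your setup:

      # hash of the value stored in the ConfigMap (key name is an assumption)
      kubectl get configmap prometheus-config -o jsonpath='{.data.prometheus\.yml}' | sha256sum
      # hash of the file mounted in the Pod (path is an assumption);
      # the two differ until the kubelet sync catches up
      kubectl exec deployments/prometheus -- sha256sum /etc/prometheus/prometheus.yml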

    A common solution is to treat ConfigMaps as immutable data: instead of editing the existing ConfigMap, create a new one, reference it from the Deployment template, and let that change trigger a Pod rollout. A timestamp or version number in the ConfigMap name usually works:

      volumes:
        - name: config-volume
          configMap:
            name: config-20200527-013643
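
    A rough sketch of doing this from the CI job with plain kubectl, assuming the config files sit in the current directory and the ConfigMap volume is the first volume in the prometheus Deployment (adjust both to your manifests):

      # create a fresh, versioned ConfigMap from the config files
      NEW_CM="prometheus-config-$(date +%Y%m%d-%H%M%S)"
      kubectl create configmap "$NEW_CM" --from-file=. --dry-run=client -o yaml | kubectl apply -f -
      # point the Deployment's config volume at it; this triggers a rollout
      # (the volume index 0 is an assumption about your Deployment spec)
      kubectl patch deployment prometheus --type=json \
        -p='[{"op":"replace","path":"/spec/template/spec/volumes/0/configMap/name","value":"'"$NEW_CM"'"}]'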
    

    Another solution is to add an annotation to the Deployment's pod template containing a checksum of the ConfigMap data. When the checksum changes, new Pods are rolled out with the updated ConfigMap. This is common in Helm templates:

    annotations:
      checksum/config: {{ include (print $.Template.BasePath "/config.yaml") . | sha256sum }}
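
    Without Helm, the CI job can get a similar effect by stamping a checksum annotation onto the pod template itself, which also triggers a rollout. A sketch, reusing jq from the question; the annotation key is arbitrary:

      # hash the live ConfigMap data and write it into the pod template;
      # whenever the hash changes, the Deployment rolls out new Pods
      CHECKSUM=$(kubectl get configmap prometheus-config -o json | jq -S .data | sha256sum | cut -d' ' -f1)
      kubectl patch deployment prometheus \
        -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"checksum/config\":\"$CHECKSUM\"}}}}}"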
    

    In the specific case of Prometheus's slow start-up, a Deployment rollout is technically a restart, so whether the "outage" matters depends on whether the "readiness" probe for Prometheus meets your expectation of "online".
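
    For reference, Prometheus exposes a /-/ready endpoint that only reports success once the server is ready to serve (i.e. after the TSDB replay), so a readiness probe along these lines keeps a rolling update from counting the new Pod as available too early. The port and timings here are assumptions:

      readinessProbe:
        httpGet:
          path: /-/ready
          port: 9090
        periodSeconds: 10
        failureThreshold: 60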

    If you still need to use SIGHUP, the shell's test command can compare file modification times with -ot and -nt. In a while loop, the job can wait for the mounted ConfigMap file to update before sending the signal:

    kubectl exec deployments/prometheus -- /bin/sh -c "touch /tmp/cireload"
    # apply config map changes here, then wait until the mounted file
    # is newer than the marker before reloading
    kubectl exec deployments/prometheus -- /bin/sh -c 'while ! [ /tmp/cireload -ot /path/to/configmap.yaml ]; do sleep 5; echo "waiting for configmap $(date)"; done; /bin/pkill -HUP prometheus'
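
    Putting it together, the GitLab CI job could look roughly like this; the job name is made up, and the runner is assumed to have a working kubectl context (plus jq) for the cluster:

      # .gitlab-ci.yml sketch
      reload-prometheus-config:
        script:
          - kubectl exec deployments/prometheus -- /bin/sh -c "touch /tmp/cireload"
          # ... the ConfigMap update loop from the question goes here ...
          - kubectl exec deployments/prometheus -- /bin/sh -c 'while ! [ /tmp/cireload -ot /path/to/configmap.yaml ]; do sleep 5; done; /bin/pkill -HUP prometheus'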