urllib, prometheus, prometheus-pushgateway

Pushing metrics to Prometheus Pushgateway fails frequently


I am using the Prometheus Python client library (prometheus_client) to push metrics to a Pushgateway. The push frequently fails with the error below. How can I find the root cause of this issue?

 push_to_gateway(
  File "/usr/local/lib/python3.8/dist-packages/prometheus_client/exposition.py", line 285, in push_to_gateway
    _use_gateway('PUT', gateway, job, registry, grouping_key, timeout, handler)
  File "/usr/local/lib/python3.8/dist-packages/prometheus_client/exposition.py", line 358, in _use_gateway
    handler(
  File "/usr/local/lib/python3.8/dist-packages/prometheus_client/exposition.py", line 217, in handle
    resp = build_opener(HTTPHandler).open(request, timeout=timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/usr/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 1369, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.8/urllib/request.py", line 1330, in do_open
    r = h.getresponse()
  File "/usr/lib/python3.8/http/client.py", line 1332, in getresponse
    response.begin()
  File "/usr/lib/python3.8/http/client.py", line 303, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.8/http/client.py", line 272, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

Solution

  • We deploy our services on Kubernetes, and I was pushing metrics to the Pushgateway through its ingress. Switching to the Kubernetes service name instead of the ingress reduced these errors significantly, but that is not a portable solution if the service is relocated to another cluster. What worked for me was retrying the push with a Python decorator function.
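
The answer does not include its decorator, so the following is only a minimal sketch of what such a retry wrapper could look like, not the author's actual code. The names `retry`, `TRANSIENT_ERRORS` and `push_metrics`, the attempt count and the backoff delays are all assumptions chosen for illustration; only `push_to_gateway` and the `RemoteDisconnected` exception come from the traceback above.

```python
import functools
import http.client
import time
import urllib.error


def retry(exceptions, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry the wrapped function when one of `exceptions` is raised.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts and
    re-raises the last exception once `attempts` calls have failed.
    (Hypothetical helper, not part of prometheus_client.)
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the original error
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


# Assumption: these are the errors worth retrying. RemoteDisconnected is the
# one seen in the traceback; the others cover similar connection failures.
TRANSIENT_ERRORS = (
    http.client.RemoteDisconnected,
    urllib.error.URLError,
    ConnectionError,
)


@retry(TRANSIENT_ERRORS, attempts=3, base_delay=0.5)
def push_metrics(gateway, job, registry):
    # Imported lazily so the decorator itself needs only the standard library.
    from prometheus_client import push_to_gateway
    push_to_gateway(gateway, job=job, registry=registry)
```

Calling `push_metrics(gateway, job, registry)` then behaves like a plain `push_to_gateway` call, except that a dropped connection is retried up to three times before the exception propagates. Retrying should be reasonably safe here because, as the traceback shows, `push_to_gateway` issues a PUT, so a repeated push sends the same metrics for the same job again. Logging each failed attempt inside the `except` branch would also help with the original question of tracking down the root cause.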