Tags: elasticsearch, kubernetes, logstash, filebeat

Filebeat initialization fails with 10.96.0.1:443 i/o timeout error


In my k8s cluster, Filebeat fails to connect after a node restart. The other k8s nodes work normally.

Logs from the Filebeat pod:

2020-08-30T03:18:58.770Z    ERROR   kubernetes/util.go:90   kubernetes: Querying for pod failed with error: performing request: Get https://10.96.0.1:443/api/v1/namespaces/monitoring/pods/filebeat-gfg5l: dial tcp 10.96.0.1:443: i/o timeout
2020-08-30T03:18:58.770Z    INFO    kubernetes/watcher.go:180   kubernetes: Performing a resource sync for *v1.PodList
2020-08-30T03:19:28.771Z    ERROR   kubernetes/watcher.go:183   kubernetes: Performing a resource sync err performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout for *v1.PodList
2020-08-30T03:19:28.771Z    INFO    instance/beat.go:357    filebeat stopped.
2020-08-30T03:19:28.771Z    ERROR   instance/beat.go:800    Exiting: error initializing publisher: error initializing processors: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
Exiting: error initializing publisher: error initializing processors: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

The error keeps occurring and the pod keeps restarting. I also restarted the node itself, but that didn't help.

The Filebeat version is 6.5.2, deployed as a DaemonSet. Are there any known issues like this?

All pods other than Filebeat run on that node without any problems.

Update: here is the Filebeat ConfigMap:

apiVersion: v1
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: docker
      multiline.pattern: '^[[:space:]]+'
      multiline.negate: false
      multiline.match: after
      symlinks: true
      cri.parse_flags: true
      containers:
        ids: [""]
        path: "/var/log/containers"
    processors:
    - decode_json_fields:
        fields: ["message"]
        process_array: false
        max_depth: 1
        target: message_json
        overwrite_keys: false
        when:
          contains:
            source: "/var/log/containers/app"
    - add_kubernetes_metadata:
        in_cluster: true
        default_matchers.enabled: false
        matchers:
        - logs_path:
            logs_path: /var/log/containers/
    output:
      logstash:
        hosts:
        - logstash:5044
kind: ConfigMap
metadata:
  creationTimestamp: "2020-01-06T09:31:31Z"
  labels:
    k8s-app: filebeat
  name: filebeat-config
  namespace: monitoring
  resourceVersion: "6797684985"
  selfLink: /api/v1/namespaces/monitoring/configmaps/filebeat-config
  uid: 52d86bbb-3067-11ea-89c6-246e96da5c9c


Solution

  • The add_kubernetes_metadata processor failed while querying https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0. With in_cluster: true the processor reaches the API server through the kubernetes service ClusterIP (10.96.0.1:443 here), so when that address is unreachable the publisher cannot initialize and the Beat exits. As it turned out in the discussion above, the cause was a temporary network interface problem on the node, and restarting the Beat once the interface had recovered fixed it.
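
A minimal sketch of how to verify and recover, assuming the cluster is reachable with kubectl and reusing the pod name (filebeat-gfg5l) and namespace (monitoring) from the logs and ConfigMap above:

# From the affected node, check whether the kubernetes service ClusterIP answers at all.
# Any HTTP response (even 401/403) proves connectivity; a hang reproduces the i/o timeout.
curl -k --max-time 5 https://10.96.0.1:443/version

# Once the node's network problem is resolved, restart the Beat by deleting its pod;
# the DaemonSet recreates it on the same node.
kubectl -n monitoring delete pod filebeat-gfg5l

Because Filebeat runs as a DaemonSet, deleting the pod is the usual way to restart the Beat on a single node without touching the other nodes.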