Tags: docker, elasticsearch, docker-compose, fluentd, efk

Trying to add FluentD to my workflow but it fails to connect


I was looking through GitHub issues and found some similar but not identical reports, and I am really confused about how to set up Fluentd with a security-enabled Elasticsearch flow.

The error I keep getting is:

2022-04-08 15:39:03 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. Connection refused - connect(2) for 0.0.0.0:9200 (Errno::ECONNREFUSED)

I also tried setting the host in the Fluentd config to 'elasticsearch', thinking it might be a DNS issue. Same result.

I am attaching enough information for anyone who wants to build their own containers to help debug this, but most of it is based on the Docker images published by Elastic.

The file layout is as follows:

docker-compose.yaml
fluentd/
  Dockerfile
  log/
  conf/
    fluent.conf

For the Dockerfile, the base samples I was using ran as the fluent user, but since I needed to access the certs, I set the user to root, because the certs are owned by root.

Fluentd Dockerfile:

FROM fluent/fluentd:v1.12-debian-1

# Use root account to use apt
USER root

# below RUN includes plugin as examples elasticsearch is not required
# you may customize including plugins as you wish
RUN buildDeps="sudo make gcc g++ libc-dev" \
 && apt-get update \
 && apt-get install -y --no-install-recommends $buildDeps \
 && sudo gem install fluent-plugin-secure-forward fluent-plugin-elasticsearch \
 && sudo gem sources --clear-all \
 && SUDO_FORCE_REMOVE=yes \
    apt-get purge -y --auto-remove \
                  -o APT::AutoRemove::RecommendsImportant=false \
                  $buildDeps \
 && rm -rf /var/lib/apt/lists/* \
 && rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem

COPY fluent.conf /fluentd/etc/fluent.conf
COPY entrypoint.sh /bin/.

# USER fluent
USER 0
ENTRYPOINT ["tini",  "--", "/bin/entrypoint.sh"]
CMD ["fluentd"]

Next is the config file I created, which is probably where one of my problems lies. Part of me thinks one of the issues is related to the generated certs, but no one has really explained much about this to me.

# fluentd/conf/fluent.conf
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match *.**>
  @type copy
  <store>
    @type elasticsearch
    host elasticsearch
    port 9200
    logstash_format true
    logstash_prefix fluentd
    logstash_dateformat %Y%m%d
    include_tag_key true
    type_name access_log
    tag_key @log_name
    flush_interval 1s
    flush_mode interval
    user elastic
    password foobar
    ssl_verify true
    ca_file /usr/share/fluentd/certs/ca/ca.crt
    client_crt /usr/share/fluentd/certs/fluentd/fluentd.crt
    client_key /usr/share/fluentd/certs/fluentd/fluentd.key
  </store>
  <store>
    @type stdout
  </store>
</match>

Now the docker-compose file, which ties everything together. It is pretty much a trimmed-down version of what Elastic publishes. Their sample sets up a three-node Elasticsearch cluster with one Kibana instance, whereas I reduced it to a single-node cluster with Kibana and a Fluentd instance.

version: "3"

services:
  setup:
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
    user: "0"
    command: >
      bash -c '
        if [ x${ELASTIC_PASSWORD} == x ]; then
          echo "Set the ELASTIC_PASSWORD environment variable in the .env file";
          exit 1;
        elif [ x${KIBANA_PASSWORD} == x ]; then
          echo "Set the KIBANA_PASSWORD environment variable in the .env file";
          exit 1;
        fi;
        if [ ! -f certs/ca.zip ]; then
          echo "Creating CA";
          bin/elasticsearch-certutil ca --silent --pem -out config/certs/ca.zip;
          unzip config/certs/ca.zip -d config/certs;
        fi;
        if [ ! -f certs/certs.zip ]; then
          echo "Creating certs";
          echo -ne \
          "instances:\n"\
          "  - name: elasticsearch\n"\
          "    dns:\n"\
          "      - elasticsearch\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          "  - name: fluentd\n"\
          "    dns:\n"\
          "      - fluentd\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          > config/certs/instances.yml;
          bin/elasticsearch-certutil cert --silent --pem -out config/certs/certs.zip --in config/certs/instances.yml --ca-cert config/certs/ca/ca.crt --ca-key config/certs/ca/ca.key;
          unzip config/certs/certs.zip -d config/certs;
        fi;
        echo "Setting file permissions"
        chown -R root:root config/certs;
        find . -type d -exec chmod 750 \{\} \;;
        find . -type f -exec chmod 640 \{\} \;;
        echo "Waiting for Elasticsearch availability";
        until curl -s --cacert config/certs/ca/ca.crt https://elasticsearch:9200 | grep -q "missing authentication credentials"; do sleep 30; done;
        echo "Setting kibana_system password";
        until curl -s -X POST --cacert config/certs/ca/ca.crt -u elastic:${ELASTIC_PASSWORD} -H "Content-Type: application/json" https://elasticsearch:9200/_security/user/kibana_system/_password -d "{\"password\":\"${KIBANA_PASSWORD}\"}" | grep -q "^{}"; do sleep 10; done;
        echo "All done!";
      '
    healthcheck:
      test: ["CMD-SHELL", "[ -f config/certs/elasticsearch/elasticsearch.crt ]"]
      interval: 1s
      timeout: 5s
      retries: 120
  elasticsearch:
    depends_on:
      setup:
        condition: service_healthy
    image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
      - esdata01:/usr/share/elasticsearch/data
    ports:
      - ${ES_PORT}:9200
    environment:
      - node.name=elasticsearch
      - cluster.name=${CLUSTER_NAME}
      - discovery.type=single-node
      # If I want to switch from a single-node cluster to a cluster of 2+ nodes, remove discovery.type above and enable the two lines below.
      # - cluster.initial_master_nodes=elasticsearch,es02,es03
      # - discovery.seed_hosts=es02,es03
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - bootstrap.memory_lock=true
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=true
      - xpack.security.http.ssl.key=certs/elasticsearch/elasticsearch.key
      - xpack.security.http.ssl.certificate=certs/elasticsearch/elasticsearch.crt
      - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.http.ssl.verification_mode=certificate
      - xpack.security.transport.ssl.enabled=true
      - xpack.security.transport.ssl.key=certs/elasticsearch/elasticsearch.key
      - xpack.security.transport.ssl.certificate=certs/elasticsearch/elasticsearch.crt
      - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt
      - xpack.security.transport.ssl.verification_mode=certificate
      - xpack.license.self_generated.type=${LICENSE}
    mem_limit: ${MEM_LIMIT}
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120
  kibana:
    depends_on:
      elasticsearch:
        condition: service_healthy
    build: fluentd/.
    image: docker.elastic.co/kibana/kibana:${STACK_VERSION}
    volumes:
      - certs:/usr/share/kibana/config/certs
      - kibanadata:/usr/share/kibana/data
    ports:
      - ${KIBANA_PORT}:5601
    environment:
      - SERVERNAME=kibana
      - ELASTICSEARCH_HOSTS=https://elasticsearch:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
      - ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt
    mem_limit: ${MEM_LIMIT}
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s -I http://localhost:5601 | grep -q 'HTTP/1.1 302 Found'",
        ]
      interval: 10s
      timeout: 10s
      retries: 120
  fluentd:
    depends_on:
      elasticsearch:
        condition: service_healthy
    build: fluentd/.
    volumes:
      - certs:/usr/share/fluentd/certs
      - ./fluentd/log:/fluentd/log
      - "./fluentd/conf:/fluentd/etc"
    ports:
      - "24224:24224"
      - "24224:24224/udp"

volumes:
  certs:
    driver: local
  esdata01:
    driver: local
  kibanadata:
    driver: local

I am just at a loss as to how to get Fluentd talking to the Elasticsearch instance for ingestion. I figured it might be related to certs, but I am really not sure. Even when I removed ALL security from this setup, it still failed to connect to Elasticsearch, which makes me think there is a deeper underlying problem I am not aware of.

I also updated the conf file to see if DNS would resolve, changing the host from 0.0.0.0 to elasticsearch and then to localhost. I thought one of these might work, but none did. All of them still return the ECONNREFUSED error.
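A note for anyone reproducing this: inside the fluentd container, 0.0.0.0 and localhost point back at the fluentd container itself, where nothing listens on 9200, so ECONNREFUSED is the expected result for those two values. On the Compose network, the service name (elasticsearch here) is the hostname that reaches the other container. The following is a minimal sketch for checking plain TCP reachability from within the container. It assumes bash is available in the image (it is in Debian-based images), and `can_connect` is a hypothetical helper name, not something from the post.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port can be opened, non-zero otherwise.
# Relies on bash's built-in /dev/tcp redirection, so no extra packages are needed.
can_connect() {
  local host="$1" port="$2"
  # Open the socket in a subshell so it is closed again as soon as the
  # subshell exits; error messages from a failed connect are silenced.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example calls (run inside the fluentd container):
#   can_connect elasticsearch 9200 && echo "reachable" || echo "not reachable"
#   can_connect localhost 9200     && echo "reachable" || echo "not reachable"
```

This only tests the TCP layer. It says nothing about TLS or authentication, which is why a successful check here can still be followed by errors from the plugin.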


Solution

  • I started running curl from within the Fluentd container to see what more I could find out.

    apt-get update
    apt-get install -y curl
    curl --cacert /usr/share/fluentd/certs/ca/ca.crt https://elasticsearch:9200
    {"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":["Basic realm=\"security\" charset=\"UTF-8\"","Bearer realm=\"security\"","ApiKey"]}},"status":401}
    

    So there does seem to be at least a response: Elasticsearch is reachable, and it answers with a 401 because the request carried no credentials.

    That made me think that I DID in fact need the user/password for Basic auth, and once I added them to Fluentd's conf file, it worked. I had assumed the certs would be enough for the handshake.

    My reasoning was that with the certs in place, I didn't need to add user/password to the conf. Since the sample used bin/elasticsearch-certutil, I assumed I could pass in a cert generated specifically for Fluentd and it would just work. I must have been mistaken.
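The 401 above comes from the authentication layer rather than from TLS: the CA cert lets the client trust the server, but Elasticsearch still expects credentials (HTTP Basic here) unless certificate-based authentication has been configured separately. As a rough sketch of how to confirm that before touching the Fluentd conf, the same curl check can be repeated with credentials. `es_status` is a hypothetical helper name, and the user, password, and paths passed to it are whatever your own setup uses.

```shell
#!/usr/bin/env bash
# Print only the HTTP status code Elasticsearch returns for an authenticated request.
# Arguments: URL, user, password, path to the CA certificate.
es_status() {
  local url="$1" user="$2" pass="$3" ca="$4"
  # -s                : no progress output
  # -o /dev/null      : discard the response body
  # -w '%{http_code}' : print just the status code
  # --cacert          : trust the CA created by the setup service
  # -u                : send HTTP Basic credentials
  curl -s -o /dev/null -w '%{http_code}' --cacert "$ca" -u "${user}:${pass}" "$url"
}

# Example call (run inside the fluentd container, password taken from your .env):
#   es_status https://elasticsearch:9200 elastic "$ELASTIC_PASSWORD" /usr/share/fluentd/certs/ca/ca.crt
```

A 200 here means the credentials are accepted, and the same values are what go into `user` and `password` in the elasticsearch store. If the plugin still cannot connect after that, one thing worth checking (an assumption on my part, not something the post confirms) is whether `scheme https` is set, since the plugin's `scheme` parameter defaults to `http`.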