Tags: elasticsearch, logstash, grafana, fluentd

Grafana with Elasticsearch - Does not show data when setting Group By Average


Using Grafana 7.2 and Elasticsearch 7.5.1.

Everything is up and running in OpenShift. The Elasticsearch datasource is correctly configured and a very simple dashboard is created.

From a Spring Boot service, also running in OpenShift, I am sending logs to Elasticsearch using Fluentd.

The documents stored in Elasticsearch look like this (taken from Grafana's "Logs" result panel):


Edit: following @karan shah's suggestion, I add the original log that I am sending to Elasticsearch through Fluentd:

{
   "onpay":{
      "traceId":"9999",
      "inout":"OUT",
      "startTime":"2020-10-01T10:13:43.806+0200",
      "finishTime":"2020-10-01T10:13:43.827+0200",
      "executionTime":21.0,
      "entrySize":124.0,
      "exitSize":124.0,
      "differenceSize":0.0,
      "user":"pgallello",
      "methodPath":"http://localhost:8083/api/serviceEntryPoint",
      "errorMessage":null,
      "className":"com.myorganization.mypackage.MyController",
      "methodName":"serviceTemplateEntryPoint"
   }
}

It is an Elasticsearch document with a field "message", which is the one I want to base my dashboard on. Note two things at this point:

  • The field highlighted in red: executionTime.
  • The field _source just has an [object Object] value.

Problem 1:

What I need to do (and am not managing to) is the tricky part: I need a histogram showing the average value of the executionTime field per interval.

Following the official documentation, particularly this official video from Grafana, I should be able to change the Group By field to Average and select @value from the field selector. Unfortunately, that @value option does not appear there (could the _source = [object Object] field have something to do with it?).

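For reference, the "average per interval" this panel needs corresponds in Elasticsearch to a date histogram with a nested avg sub-aggregation. A sketch of the request body such a panel would issue, assuming executionTime has been indexed as a numeric field (the aggregation names and interval here are illustrative):

```
{
  "size": 0,
  "aggs": {
    "per_interval": {
      "date_histogram": { "field": "@timestamp", "fixed_interval": "30s" },
      "aggs": {
        "avg_execution_time": { "avg": { "field": "onpay.executionTime" } }
      }
    }
  }
}
```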

Problem 2:

The other doubt is whether the Query field is valid in that format, and how to access the executionTime field, which sits inside the message field of the Elasticsearch document, in a hierarchy of message -> onpay -> executionTime.
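Once the document is flattened into individual fields (rather than one message string), Elasticsearch addresses nested keys with dot notation, so the hierarchy above becomes onpay.executionTime. A sketch of what that would look like in Grafana's Elasticsearch panel (the filter values are illustrative):

```
# Lucene query in Grafana's Query field
onpay.inout:OUT AND onpay.user:pgallello

# Metric: Average    Field: onpay.executionTime
```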

Fluentd config file:

  <source>
    @type forward
    port 24224
    bind "0.0.0.0"
  </source>
  <filter onpayapp.**>
    @type parser
    key_name "onpayapp"
    reserve_data true
    <parse>
      @type "json"
    </parse>
  </filter>
  <match onpay.**>
    @type copy
    <store>
      @type "elasticsearch"
      host "elasticdb"
      port 9200
      logstash_format true
      logstash_prefix "applogs"
      logstash_dateformat "%Y%m%d"
      include_tag_key true
      type_name "app_log"
      tag_key "@log_name"
      flush_interval 1s
      <parse>
        @type json
      </parse>
      <buffer>
        flush_interval 1s
      </buffer>
    </store>
    <store>
      @type "stdout"
    </store>
  </match>

Solution

  • Currently what you have there is the entire JSON as a string in the message field, so Elasticsearch cannot apply any mathematical operations to it. What you need to do is have Fluentd parse the log line as JSON, so that every field within that JSON (like logger and level) becomes its own field of the Elasticsearch document. Once you have that, Elasticsearch will mostly automatically interpret executionTime as a number and make it available for aggregations. After that you will see that field in your Grafana dropdown.
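A sketch of such a parsing step, assuming the raw JSON string arrives in a field named message (adjust key_name to whichever field actually carries it in your pipeline):

```
<filter onpay.**>
  @type parser
  key_name message      # field that holds the raw JSON string
  reserve_data true     # keep the original fields alongside the parsed ones
  <parse>
    @type json
  </parse>
</filter>
```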

    Here you can read more about the _source field.

    Please add your original log line to the question as well; I think it would help clarify what you want to ingest, so a suggestion can be made on a possible Fluentd configuration.

    Updated Answer based on additional information provided

    For simplicity, I used a Docker setup to run the stack and parse the log pattern provided in the question.

    Fluentd Configuration

    I have used the HTTP input so I can test with curl, but you can switch back to the forward input. I have removed the filter because I am assuming your source is already JSON, so you do not need to parse it again. You can add the matching pattern back if you have multiple types of data processed through the pipeline.

  <source>
    @type http
    port 9880
    bind 0.0.0.0
  </source>
      <match *>
        @type copy
        <store>
          @type "elasticsearch"
          host "es01"
          port 9200
          logstash_format true
          logstash_prefix "applogs"
          logstash_dateformat "%Y%m%d"
          include_tag_key true
          type_name "app_log"
          tag_key "@log_name"
          flush_interval 1s
          <parse>
            @type json
          </parse>
          <buffer>
            flush_interval 1s
          </buffer>
        </store>
        <store>
          @type "stdout"
        </store>
      </match>
    

    Fluent Docker Image

    # fluentd/Dockerfile
    FROM fluent/fluentd:v1.11-debian-1
    
    USER root
    
    RUN touch ~/.gemrc
    RUN echo ':ssl_verify_mode: 0' >> ~/.gemrc
    
    RUN buildDeps="sudo make gcc g++ libc-dev" \
     && apt-get update \
     && apt-get install -y --no-install-recommends $buildDeps \
     && sudo gem install fluent-plugin-elasticsearch \
     && sudo gem sources --clear-all \
     && SUDO_FORCE_REMOVE=yes \
        apt-get purge -y --auto-remove \
                      -o APT::AutoRemove::RecommendsImportant=false \
                      $buildDeps \
     && rm -rf /var/lib/apt/lists/* \
     && rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem
    
    
    USER fluent
    

    Docker Compose

    You can choose to run just a single Elasticsearch node; I already had this setup running.

    services:
      es01:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.8.0
        container_name: es01
        environment:
          - node.name=es01
          - cluster.name=es-docker-cluster
          - discovery.seed_hosts=es02,es03
          - cluster.initial_master_nodes=es01,es02,es03
          - bootstrap.memory_lock=true
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - data01:/usr/share/elasticsearch/data
        ports:
          - 9200:9200
        networks:
          - elastic
        healthcheck:
          interval: 20s
          retries: 10
          test: curl -s http://localhost:9200/_cluster/health | grep -vq '"status":"red"'
    
      es02:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.8.0
        container_name: es02
        environment:
          - node.name=es02
          - cluster.name=es-docker-cluster
          - discovery.seed_hosts=es01,es03
          - cluster.initial_master_nodes=es01,es02,es03
          - bootstrap.memory_lock=true
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - data02:/usr/share/elasticsearch/data
        ports:
          - 9201:9200
        networks:
          - elastic
        healthcheck:
          interval: 20s
          retries: 10
          test: curl -s http://localhost:9201/_cluster/health | grep -vq '"status":"red"'
    
      es03:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.8.0
        container_name: es03
        environment:
          - node.name=es03
          - cluster.name=es-docker-cluster
          - discovery.seed_hosts=es01,es02
          - cluster.initial_master_nodes=es01,es02,es03
          - bootstrap.memory_lock=true
          - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - data03:/usr/share/elasticsearch/data
        ports:
          - 9202:9200
        networks:
          - elastic
        healthcheck:
          interval: 20s
          retries: 10
          test: curl -s http://localhost:9202/_cluster/health | grep -vq '"status":"red"'
    
      kib01:
        image: docker.elastic.co/kibana/kibana:7.8.0
        container_name: kib01
        ports:
          - 5601:5601
        environment:
          ELASTICSEARCH_URL: http://es01:9200
          ELASTICSEARCH_HOSTS: http://es01:9200
        networks:
          - elastic
        healthcheck:
          interval: 10s
          retries: 20
          test: curl --write-out 'HTTP %{http_code}' --fail --silent --output /dev/null http://localhost:5601/api/status
      
      fluentd:
        build: ./fluentd
        volumes:
          - "./fluentd/conf/:/fluentd/etc/:ro"
        networks:
          - elastic
        ports:
          - "9880:9880"
    
    volumes:
      data01:
        driver: local
      data02:
        driver: local
      data03:
        driver: local
    
    networks:
      elastic:
        driver: bridge
    
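Assuming the compose file above is saved as docker-compose.yml, with the Dockerfile under ./fluentd and the config under ./fluentd/conf, the stack can be brought up with:

```
# build the custom Fluentd image and start all containers in the background
docker-compose up -d --build

# tail Fluentd's stdout copy of each ingested record
docker-compose logs -f fluentd
```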

    Curl for Test

    curl -X POST -d 'json={
      "onpay": {
        "traceId": "9999",
        "inout": "OUT",
        "startTime": "2020-10-01T10:13:43.806+0200",
        "finishTime": "2020-10-01T10:13:43.827+0200",
        "executionTime": 21.0,
        "entrySize": 124.0,
        "exitSize": 124.0,
        "differenceSize": 0.0,
        "user": "pgallello",
        "methodPath": "http://localhost:8083/api/serviceEntryPoint",
        "errorMessage": null,
        "className": "com.myorganization.mypackage.MyController",
        "methodName": "serviceTemplateEntryPoint"
      }
    }' http://localhost:9880/
    

    Result in Elasticsearch

    Once you get all your JSON keys ingested like this, Elasticsearch will auto-map most fields and allow search, aggregation, etc. based on the field type. You can change a field's type and formatting from Kibana's index management if you want.
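Once documents are flowing, you can confirm the inferred types by inspecting the index mapping (the applogs-* index pattern is assumed from the logstash_prefix configured above); executionTime should come back as a numeric type such as float:

```
curl -s 'http://localhost:9200/applogs-*/_mapping?pretty'
```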