kubernetes fluentd td-agent

Dynamic tagging for Fluentd td-agent source plugin


I'm trying to implement a Streaming Sidecar Container logging architecture in Kubernetes using Fluentd.

In a single pod I have:

  • emptyDir Volume (as log storage)
  • Application container
  • Fluent log-forwarder container

Basically, the Application container writes its logs to the shared emptyDir volume. The Fluentd log-forwarder container tails this log file in the shared volume and forwards it to an external log aggregator.

The Fluentd log-forwarder container uses the following config in td-agent.conf:

    <source>
      @type tail
      tag "#{ENV['TAG_VALUE']}"
      path (path to log file in volume)
      pos_file /var/log/td-agent/tmp/access.log.pos
      format json
      time_key time
      time_format %iso8601
      keep_time_key true
    </source>

    <match *.*>
      @type forward
      @id forward_tail
      heartbeat_type tcp
      <server>
        host (server-host-address)
      </server>
    </match>

I set the tag from an environment variable so that I can change it without modifying this config and rebuilding the image, for example when I run this container side-by-side with a different Application container.

Now, I set the environment variable value during pod creation in Kubernetes:

    .
    .
    spec:
      containers:
      - name: application-pod
        image: application-image:1.0
        ports:
        - containerPort: 1234
        volumeMounts:
        - name: logvolume
          mountPath: /var/log/app
      - name: log-forwarder
        image: log-forwarder-image:1.0
        env:
        - name: "TAG_VALUE"
          value: "app.service01"
        volumeMounts:
        - name: logvolume
          mountPath: /var/log/app
      volumes:
      - name: logvolume
        emptyDir: {}

After deploying the pod, I found that the tag value in the Fluentd log-forwarder container comes out empty (expected value: "app.service01"). I imagine it's because Fluentd's td-agent initializes before the TAG_VALUE environment variable gets assigned.

So, the main question is...
How can I dynamically set the td-agent's tag value?

But really, what I'm wondering is:
Is it possible to assign an environment variable before a container's initialization in Kubernetes?


Solution

  • To answer your first question (How can I dynamically set the td-agent's tag value?): what you are already doing, defining tag "#{ENV['TAG_VALUE']}" in the Fluentd config file, seems to be the best way.

    As for your second question: yes, environment variables are assigned before the container's process starts.
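
    The ordering can be illustrated outside Kubernetes. In this minimal sketch, `env` stands in for the container runtime (an analogy, not what the kubelet literally runs): the variable is already part of the environment when the child process starts. The value `app.service01` is the one from the question; the variable name `seen` is only for illustration.

    ```shell
    # `env` builds the environment first and only then starts the child
    # process, much like a runtime starting a container's entrypoint.
    seen=$(env TAG_VALUE=app.service01 sh -c 'printf "%s" "$TAG_VALUE"')
    echo "child process saw: $seen"
    ```

    Whatever the entrypoint starts inherits that environment unless something in between clears it.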

    So your setup should work. I tested it with the sample YAML below, and it worked fine.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: fluentd-conf
    data:
      fluentd.conf.template: |
        <source>
          @type tail
          tag "#{ENV['TAG_VALUE']}"
          path /var/log/nginx/access.log
          format nginx
        </source>
        <match *.*>
          @type stdout
        </match>
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: log-forwarder
      labels:
        purpose: test-fluentd
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          volumeMounts:
            - name: logvolume
              mountPath: /var/log/nginx
        - name: fluentd
          image: fluent/fluentd
          env:
            - name: "TAG_VALUE"
              value: "test.nginx"
            - name: "FLUENTD_CONF"
              value: "fluentd.conf"
          volumeMounts:
            - name: fluentd-conf
              mountPath: /fluentd/etc
            - name: logvolume
              mountPath: /var/log/nginx
      volumes:
        - name: fluentd-conf
          configMap:
            name: fluentd-conf
            items:
              - key: fluentd.conf.template
                path: fluentd.conf
        - name: logvolume
          emptyDir: {}
      restartPolicy: Never
    

    When I curl the nginx pod, I see this output on the fluentd container's stdout:

    kubectl logs -f log-forwarder -c fluentd
    
    2019-03-20 09:50:54.000000000 +0000 test.nginx: {"remote":"10.20.14.1","host":"-","user":"-","method":"GET","path":"/","code":"200","size":"612","referer":"-","agent":"curl/7.60.0","http_x_forwarded_for":"-"}
    2019-03-20 09:50:55.000000000 +0000 test.nginx: {"remote":"10.20.14.1","host":"-","user":"-","method":"GET","path":"/","code":"200","size":"612","referer":"-","agent":"curl/7.60.0","http_x_forwarded_for":"-"}
    2019-03-20 09:50:56.000000000 +0000 test.nginx: {"remote":"10.128.0.26","host":"-","user":"-","method":"GET","path":"/","code":"200","size":"612","referer":"-","agent":"curl/7.60.0","http_x_forwarded_for":"-"}
    

    As you can see, my environment variable TAG_VALUE=test.nginx has been applied to the log entries.
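
    To check the same thing in your own pod, you can read the variable from inside the running container. The helper below is a rough sketch, not a definitive tool: the function name `check_tag_value` and the `KUBECTL` override are hypothetical, and it assumes the image ships `printenv`. Only `kubectl exec POD -c CONTAINER -- COMMAND` is standard kubectl usage.

    ```shell
    # Hypothetical helper: compares TAG_VALUE as seen inside a container
    # with the value you expect. KUBECTL is an assumed override so that a
    # different binary (or a stub) can be used in place of kubectl.
    check_tag_value() {
      pod=$1
      container=$2
      expected=$3
      # Read the variable from the container's own environment.
      actual=$(${KUBECTL:-kubectl} exec "$pod" -c "$container" -- printenv TAG_VALUE)
      if [ "$actual" = "$expected" ]; then
        echo "OK: TAG_VALUE=$actual"
      else
        echo "MISMATCH: expected '$expected', got '$actual'"
        return 1
      fi
    }

    # Example call against the test pod above:
    # check_tag_value log-forwarder fluentd test.nginx
    ```

    If this prints OK but the tag is still empty, the variable reaches the container and the problem lies in how the Fluentd process is started or configured.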

    I hope it will be useful.