Google Cloud Container-Optimized OS host logs to Stackdriver


TL;DR
What is the best practice for sending Container-Optimized OS host logs (SSH connections and executed shell commands) to Stackdriver?

Background:
I'm using Google's Container-Optimized OS, which works great. It's super easy to send the container logs to Stackdriver, but how do I send host logs to Stackdriver?

It's for auditing purposes: I need to log all SSH connections (accepted or denied) and all commands executed via a shell. Previously I would simply send the rsyslogd facilities (auth, authpriv) to Stackdriver via the Stackdriver host logger package.

This is for Container-Optimized OS VMs running in a managed instance group (MIG), not in Google Kubernetes Engine.

It might be super obvious, but I can't seem to find any documentation on it.


Solution

  • At a high level, this is what you need to do on any GCP COS instance to ship the OS audit logs to Google Stackdriver:

    First, enable audit logging on COS with the following command: systemctl start cloud-audit-setup. This causes audit logs to be generated and captured in the instance's journal; you can inspect them with the journalctl command.

    Second, you need a Stackdriver logging agent running on the instance, configured to ship audit logs from the instance journal to Stackdriver. This can be done by running the fluentd-gcp Google container image as a Docker container.

    The cloud-init configuration below does the whole job. All you need to do is set instance metadata with the key "user-data" and the script below as its value:

    #cloud-config
    users:
    - name: logger
      uid: 2001
      groups: docker
    
    write_files:
    
    - path: /etc/google-fluentd/fluentd.conf
      permissions: 0644
      owner: root
      content: |
        # This config comes from a heavily trimmed version of the
        # container-engine-customize-fluentd project. The upstream config is here:
        # https://github.com/GoogleCloudPlatform/container-engine-customize-fluentd/blob/6a46d72b29f3d8e8e495713bc3382ce28caf744e/kubernetes/fluentd-configmap.yaml
        <source>
            @type systemd
            path /var/log/journal
            pos_file /var/log/gcp-journald.pos
            filters [{ "SYSLOG_IDENTIFIER": "audit" }]  
            tag node-journal
            read_from_head true
        </source>
        <match **>
          @type copy
           <store>
            @type google_cloud
            # Set the buffer type to file to improve the reliability
            # and reduce the memory consumption
            buffer_type file
            buffer_path /var/log/google-fluentd/cos-system.buffer
            # Set queue_full action to block because we want to pause gracefully
            # in case of the off-the-limits load instead of throwing an exception
            buffer_queue_full_action block
            # Set the chunk limit conservatively to avoid exceeding the GCL limit
            # of 10MiB per write request.
            buffer_chunk_limit 2M
            # Cap the combined memory usage of this buffer and the one below to
            # 2MiB/chunk * (6 + 2) chunks = 16 MiB
            buffer_queue_limit 6
            # Never wait more than 5 seconds before flushing logs in the non-error
            # case.
            flush_interval 5s
            # Never wait longer than 30 seconds between retries.
            max_retry_wait 30
            # Disable the limit on the number of retries (retry forever).
            disable_retry_limit
            # Use multiple threads for processing.
            num_threads 2
          </store>
        </match>
    - path: /etc/systemd/system/logger.service
      permissions: 0644
      owner: root
      content: |
        [Unit]
        Description=logging docker container
        Requires=network-online.target
        After=network-online.target
    
        [Service]
        Environment="HOME=/home/logger"
        ExecStartPre=/usr/share/google/dockercfg_update.sh
        ExecStartPre=/bin/mkdir -p /var/log/google-fluentd/
        ExecStartPre=-/usr/bin/docker rm -fv logger
        ExecStart=/usr/bin/docker run --rm -u 0 \
           --name=logger \
           -v /var/log/:/var/log/ \
           -v /var/lib/docker/containers:/var/lib/docker/containers \
           -v /etc/google-fluentd/:/etc/fluent/config.d/ \
           --env='FLUENTD_ARGS=-q' \
           gcr.io/google-containers/fluentd-gcp:2.0.17
        Restart=always
        RestartSec=1
    runcmd:
    - systemctl daemon-reload
    - systemctl start logger.service
    - systemctl start cloud-audit-setup
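
Since the question is about a managed instance group, one way to attach the "user-data" metadata is through the instance template the group is built from. The following is a minimal sketch, not a definitive recipe: it assumes the cloud-config above was saved locally as cloud-init.yaml, and the template name cos-audit-logging and the cos-stable image family are placeholders chosen for illustration. It only prints the gcloud command by default so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only. TEMPLATE_NAME and USER_DATA_FILE are hypothetical placeholders;
# USER_DATA_FILE is assumed to contain the cloud-config shown above.
TEMPLATE_NAME="${TEMPLATE_NAME:-cos-audit-logging}"
USER_DATA_FILE="${USER_DATA_FILE:-cloud-init.yaml}"

# Print the given command when DRY_RUN=1 (the default), otherwise execute it.
run_or_print() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create an instance template whose "user-data" metadata key is read from
# the cloud-config file, using a Container-Optimized OS image.
create_template() {
  run_or_print gcloud compute instance-templates create "$TEMPLATE_NAME" \
    --image-family cos-stable \
    --image-project cos-cloud \
    --metadata-from-file "user-data=$USER_DATA_FILE"
}

create_template
```

Running the script with DRY_RUN=0 executes the command instead of printing it. Once an instance built from the template has booted, the journal filter used in the fluentd config can be checked on the host with journalctl SYSLOG_IDENTIFIER=audit, and docker logs logger should show whether the fluentd container started.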