google-cloud-platform google-compute-engine stackdriver google-cloud-stackdriver google-container-optimized-os

How do I get startup-script logs from Container-optimized OS in a GCE instance?


I'm running a container-optimized compute instance with this startup-script:

#!/bin/bash

mkdir /home/my-app
cd /home/my-app
export HOME=/home/my-app

docker-credential-gcr configure-docker


docker run --rm --log-driver=gcplogs --name my-app --security-opt seccomp=./config.json gcr.io/my-project/my-app:latest

The --log-driver and --name flags are set according to the GCP community guide and the Docker docs.

Yet I see no logs from the container boot up.

Also, when I SSH into the instance and run logger "hello from logger", the message doesn't show up in Cloud Logging. I've tried converting to advanced filters and removing all filtering except the "hello from logger" string filter.

How do I properly set up the logging? I'm using bunyan inside my NodeJS app, but when the app fails I have absolutely no visibility. I'd love to have all the logs from journalctl in Cloud Logging, or at least the startup-script part of journalctl. Right now I'm retrieving them by SSHing into the instance and running journalctl -r | grep startup-script.

Update

Access scopes are correctly set:

Stackdriver Logging API: Write Only

I'm using the default Compute Engine service account. Here are the commands that I'm creating this VM with:

gcloud compute instance-templates create $APP_ID-template \
    --scopes=bigquery,default,compute-rw,storage-rw \
    --image-project=cos-cloud \
    --image-family=cos-77-lts \
    --machine-type=e2-medium \
    --metadata-from-file=startup-script=./start.sh \
    --tags=http-server,https-server

gcloud compute instance-groups managed create $APP_ID-group \
    --size=1 \
    --template=$APP_ID-template

Startup-script:

#!/bin/bash

mkdir /home/startDir
cd /home/startDir
export HOME=/home/startDir

docker-credential-gcr configure-docker

docker run --log-driver=gcplogs --name my-app --security-opt seccomp=./config.json gcr.io/project-id/app:latest

This VM is running a NodeJS script. I'm not providing JSON keys to my NodeJS script. The bunyan logger is correctly sending logs to Cloud Logging. It only fails to send logs when the server completely crashes.

Logging API is enabled. I'm getting this:

● stackdriver-logging.service - Fluentd container for Stackdriver Logging
   Loaded: loaded (/usr/lib/systemd/system/stackdriver-logging.service; static; vendor preset: disabled)
   Active: inactive (dead)

That is the output of running sudo systemctl status stackdriver-logging inside the VM.


Solution

  • Google Compute Engine Container-Optimized OS has Operations Logging (formerly Stackdriver) enabled by default.

    In my list of problems and solutions, Problem #3 is the most common in my experience.

    Possible Problem #1:

    By default, new instances have the following scopes enabled:

    • Stackdriver Logging API: Write Only
    • Stackdriver Monitoring API: Write Only

    If you have modified the instance's Access Scopes, make sure that the Stackdriver scopes are enabled. This requires stopping the instance to modify scopes.
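As a rough sketch of the stop/modify/start sequence for a standalone instance, something like the following could work. The function name and its two arguments (instance name and zone) are hypothetical placeholders. Note that --scopes replaces the instance's whole scope list, so append any other scopes the instance needs. For a managed instance group, as in the question, the scopes would instead come from the instance template.

```shell
# Hypothetical helper: stop the VM, reset its access scopes, start it again.
# $1 = instance name, $2 = zone (both are placeholders for your own values).
reset_logging_scopes() {
  local instance="$1" zone="$2"
  # Scopes can only be changed while the instance is stopped.
  gcloud compute instances stop "$instance" --zone="$zone" &&
  # --scopes REPLACES the existing list; add other scopes you rely on here.
  gcloud compute instances set-service-account "$instance" --zone="$zone" \
    --scopes=logging-write,monitoring-write &&
  gcloud compute instances start "$instance" --zone="$zone"
}
```

Usage would look like reset_logging_scopes my-vm us-central1-a, where both values are examples only.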

    Possible Problem #2:

    If you are using a custom service account for this instance, make sure the service account has at least the role roles/logging.logWriter. Without this role or an equivalent one, the logger will fail.
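Granting that role from the CLI might look like the sketch below. The function name is hypothetical, and the project ID and service account email are placeholders you would substitute yourself.

```shell
# Hypothetical helper: grant the Logs Writer role to a service account.
# $1 = project ID, $2 = service account email (placeholders).
grant_log_writer() {
  local project="$1" sa_email="$2"
  gcloud projects add-iam-policy-binding "$project" \
    --member="serviceAccount:${sa_email}" \
    --role="roles/logging.logWriter"
}
```

For example: grant_log_writer my-project my-sa@my-project.iam.gserviceaccount.com (both values are illustrative).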

    Possible Problem #3:

    A common problem is that the Project Owner did not enable the "Cloud Logging API". Without enabling this API, the instance logger will fail.

    To verify if the logger within the instance is failing, SSH into the instance and execute this command:

    sudo systemctl status stackdriver-logging
    

    If you see error messages related to the logging API, then enable the Cloud Logging API.

    Enable the Cloud Logging API via the CLI:

    gcloud services enable logging.googleapis.com --project=<PROJECT_ID>
    

    Or via the Google Cloud Console:

    https://console.cloud.google.com/apis/library/logging.googleapis.com
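To check from the CLI whether the API is already enabled, a small test along these lines may help. This is only a sketch: the function name is made up, and the project ID argument is a placeholder.

```shell
# Hypothetical helper: succeeds (exit 0) if the Cloud Logging API is enabled.
# $1 = project ID (placeholder).
logging_api_enabled() {
  local project="$1"
  # List enabled service names, one per line, and look for an exact match.
  gcloud services list --enabled --project="$project" \
    --format="value(config.name)" | grep -qx "logging.googleapis.com"
}
```

It could be used as: logging_api_enabled my-project && echo "enabled" || echo "not enabled".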

    Possible Problem #4:

    When creating an instance via the CLI, you need to specify the following command-line option, otherwise the logging service will not start:

    --metadata=google-logging-enabled=true
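For an instance that already exists, the same metadata key can be added afterwards. The sketch below assumes a standalone instance; the function name, instance name, and zone are placeholders, and the instance may need a restart before the logging service picks up the change.

```shell
# Hypothetical helper: set google-logging-enabled=true on an existing VM.
# $1 = instance name, $2 = zone (placeholders).
enable_cos_logging() {
  local instance="$1" zone="$2"
  gcloud compute instances add-metadata "$instance" --zone="$zone" \
    --metadata=google-logging-enabled=true
}
```

For the instance template in the question, the equivalent would be to add --metadata=google-logging-enabled=true next to the existing --metadata-from-file option in the gcloud compute instance-templates create command.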
    

    [UPDATE 01/22/2021]

    The OP has two problems: 1) the Stackdriver service was not running, which the steps above solved, and 2) the startup-script output was not reaching Stackdriver.

    The current configuration for Container-Optimized OS filters journal logs by priority, and the default filter excludes the startup-script logs from what is sent to Stackdriver.

    The log level is set by the file /etc/stackdriver/logging.config.d/fluentd-lakitu.conf.

    Look for the section commented "Collects all journal logs with priority >= warning". Its PRIORITY list covers 0 through 4. If you add "5" and "6" (notice and info) to the list, the startup-script entries are logged in Operations Logging.

    You can change the log level, but this change does not persist across reboots. I have not found a solution to make the change permanent.
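The edit described above could be scripted roughly as follows. This is a sketch under a loud assumption: it expects the filter line to look like filters [{ "PRIORITY": ["0", "1", "2", "3", "4"] }], which may not match the file on your image, so check the file first. The function name is hypothetical, and it relies on GNU sed.

```shell
# Hypothetical helper: extend the journal PRIORITY filter with "5" and "6".
# ASSUMPTION: the line contains "PRIORITY" and its list ends with "4"].
# $1 = path to the fluentd config file.
add_journal_priorities() {
  local conf="$1"
  # Only touch PRIORITY lines that do not already list "5" (idempotent).
  sed -i -E '/"PRIORITY"/{/"5"/!s/"4"[[:space:]]*\]/"4", "5", "6"]/;}' "$conf"
}
```

Run as root, this would be add_journal_priorities /etc/stackdriver/logging.config.d/fluentd-lakitu.conf followed by systemctl restart stackdriver-logging. Because startup scripts run as root on every boot, calling it from the startup script might be one way to re-apply the change after a reboot, though that is untested here.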