
Shutdown script not executing on a Google Cloud VM


I am trying to get a shutdown script to execute using a Google Cloud compute VM.

I see this output when running gcloud compute connect-to-serial-port startup-test-v:

Apr  8 22:01:25 startup-test-v shutdown-script: INFO Starting shutdown scripts.
Apr  8 22:01:25 startup-test-v shutdown-script: INFO Found shutdown-script in metadata.
Apr  8 22:01:26 startup-test-v shutdown-script: INFO shutdown-script: No change requested; skipping update for [startup-test-v].
Apr  8 22:01:27 startup-test-v shutdown-script: INFO shutdown-script: Return code 0.
Apr  8 22:01:27 startup-test-v shutdown-script: INFO Finished running shutdown scripts.

I create the preemptible instance from the command line and shut it down in the GUI.

gcloud compute instances create $INSTANCE_NAME \
    --zone=$ZONE \
    --image-family=$IMAGE_FAMILY \
    --image-project=deeplearning-platform-release \
    --maintenance-policy=TERMINATE \
    --machine-type=$INSTANCE_TYPE \
    --boot-disk-size=50GB \
    --metadata="install-nvidia-driver=True" \
    --preemptible \
    --scopes="storage-rw,cloud-platform" \
    --metadata-from-file="shutdown-script=gce/shutdown_test.sh"

shutdown_test.sh is simply:

#!/bin/bash
echo "+++ Shutdown test +++"
exit 0

Startup scripts work as expected. I've also tried writing the flag with a space instead of an equals sign (--metadata-from-file shutdown-script=gce/shutdown_test.sh), with no change.

Ideas? It seems GCE is finding the shutdown script, but not executing it.


Solution

  • It turns out that an image can overwrite the shutdown-script metadata set on the command line.

    In my case, the pytorch-latest-gpu image changes the shutdown-script metadata to point at its own shutdown script. It does this during the first startup.

    If you edit that script, located at /opt/deeplearning/bin/shutdown_script.sh, you can get whatever shutdown behavior you like. Alternatively, you can edit the metadata to point at your own script. Output from your shutdown script will then appear in the serial output logs:

    Apr  9 23:26:46 new-test-d shutdown-script: INFO Starting shutdown scripts.
    Apr  9 23:26:47 new-test-d shutdown-script: INFO Found shutdown-script in metadata.
    Apr  9 23:26:47 new-test-d shutdown-script: INFO shutdown-script: ++++++++++++++ Shutdown test +++++++++++++++++
    

    You can prevent the image's script from changing shutdown-script by not granting the instance permission to modify metadata, that is, by omitting cloud-platform from --scopes. Alternatively, you can edit shutdown-script in the GUI after startup. You may also be able to set it back with a startup script.
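
Editing the metadata can also be done from the command line instead of the GUI. The following is a minimal sketch, not a tested recipe for this image: it assumes the instance has already finished its first startup, and the helper names (run, show_metadata, restore_shutdown_script) and the DRY_RUN switch are hypothetical, added only for illustration. The gcloud subcommands themselves (describe and add-metadata) are the standard ones.

```shell
#!/bin/bash
# Sketch: inspect and restore the shutdown-script metadata on an instance.
# The function names and the DRY_RUN switch are hypothetical helpers.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the instance's current metadata items, including shutdown-script.
# Arguments: instance name, zone.
show_metadata() {
    run gcloud compute instances describe "$1" --zone="$2" \
        --format="json(metadata.items)"
}

# Point shutdown-script back at a local script file.
# Arguments: instance name, zone, path to the script.
restore_shutdown_script() {
    run gcloud compute instances add-metadata "$1" --zone="$2" \
        --metadata-from-file="shutdown-script=$3"
}
```

With the variables from the question, one would first call show_metadata "$INSTANCE_NAME" "$ZONE" to confirm that the image replaced the value, then restore_shutdown_script "$INSTANCE_NAME" "$ZONE" gce/shutdown_test.sh to put the original script back. Setting DRY_RUN=1 prints the gcloud commands without running them.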