google-cloud-platform google-cloud-storage google-compute-engine google-container-os

Copy files to Container-Optimised OS from a GCP Storage bucket


How can one download files from a GCP Storage bucket to a Container-Optimised OS (COS) on instance startup?


I know of the following solutions:

Yet all of these have to be done manually and externally after an instance is started.

There is also cloud-init, yet I can't find any information on how to copy files from a Storage bucket with it. The examples seem to suggest that it's better to include the contents of files directly in the cloud-init file, which is not something I want to do for security reasons. Is it possible to download files from a Storage bucket using cloud-init?

I considered using a startup script, yet COS lacks CLI tools such as gcloud and gsutil, so I can't run any such commands in a startup script.

I know I could copy the files manually and then save the image as a boot disk, but I'm hoping there are solutions that avoid having to do so.

Most of all, I assume I'm not asking for something impossible, given that the COS instance setup lets me specify Docker volumes to mount onto the starting container. This suggests I should be able to have some private files on the instance by the time COS attempts to run my image on startup. But how?

[Image: gcp_volume_mount]


Running a startup-script that uses a cloud-sdk image to copy the files, as suggested by Guillaume, didn't work for me for a while, showing this log. Eventually I realised that the cloud-sdk image is 2.41GB uncompressed and takes over 2 minutes to pull. I tried again with an empty COS instance and the startup script completed successfully, downloading the data from a Storage bucket.
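For reference, the cloud-sdk approach can be sketched roughly as below. This is a minimal sketch, assuming the same command shape as the gsutil_wrap command in the Solution, with the google/cloud-sdk image swapped in. The function name `fetch_with_cloud_sdk` and the `/data` mount point are hypothetical choices for illustration, and the host directory is assumed to be a writable location on the COS host.

```shell
#!/bin/bash
# Sketch of a helper for a COS startup script. The function name and the
# /data mount point are illustrative assumptions, not part of the original
# answer.

# Copy one object from a bucket into a directory on the host by running
# gsutil inside the google/cloud-sdk image.
fetch_with_cloud_sdk() {
  local src="$1"       # e.g. gs://bucket/path
  local host_dir="$2"  # assumed to be a writable location on the COS host

  mkdir -p "$host_dir" || return 1

  # The host directory is bind-mounted at /data, so whatever gsutil writes
  # there remains on the host after the container exits.
  docker run --rm \
    -v "$host_dir":/data \
    google/cloud-sdk \
    gsutil cp "$src" /data/
}
```

A startup script would then call something like `fetch_with_cloud_sdk gs://bucket/path /host/path`. The first call has to pull the whole image, which is where the long delay comes from.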

However, a 2.41GB image and over 2 minutes of boot time sound like overkill for downloading a 2KB file, don't they?

I'm glad to see a working solution to my question (thanks Guillaume!) although I'm still wondering: isn't there a nicer way to do this? I feel that this method is even less tidy than manually putting the files on the COS instance and then creating a machine image to use in the future.


Solution

  • Based on Guillaume's answer I created and published a gsutil wrapper image, available as voyz/gsutil_wrap. This way I am able to run a startup-script with the following command:

    docker run -v /host/path:/container/path \
      --entrypoint gsutil voyz/gsutil_wrap \
      cp gs://bucket/path /container/path
    

    It's essentially a copy of what Guillaume suggested, except that it uses an image containing only the minimum setup required to run gsutil. As a result it weighs 0.22GB and pulls within 10-20 seconds on average, as opposed to 2.41GB and over 2 minutes respectively for the google/cloud-sdk image suggested by Guillaume.

    Also, credit to this incredibly useful StackOverflow answer, which shows how to let gsutil use the default service account for authentication.
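The authentication part can be illustrated with a short sketch. It assumes that a standalone gsutil reads a boto config file and that a `[GoogleCompute]` section with `service_account = default` makes it use the instance's default service account. Both the file contents and the helper name `write_boto_config` are assumptions for illustration, not taken from the voyz/gsutil_wrap image itself.

```shell
#!/bin/bash
# Sketch: write a minimal boto config for a standalone gsutil.
# The section and key names are an assumption based on gsutil's boto
# configuration; they are not copied from the voyz/gsutil_wrap image.

write_boto_config() {
  local target="$1"  # e.g. /etc/boto.cfg inside the image

  # Point gsutil at the default service account of the instance, whose
  # credentials are served by the GCE metadata server.
  printf '[GoogleCompute]\nservice_account = default\n' > "$target"
}
```

During an image build this might be invoked as `write_boto_config /etc/boto.cfg`. The instance's service account still needs read access to the bucket for the copy to succeed.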