Tags: google-cloud-platform, gsutil, google-cloud-scheduler

Is there a way to run gsutil as a regular Linux cronjob on GCP?


I have a script that does some stuff with Cloud SDK utilities like gsutil and bq, e.g.:

#!/usr/bin/env bash

bq query --nouse_legacy_sql --format=csv "SELECT * FROM myproject.whatever WHERE date > '$x'" > res.csv
gsutil cp res.csv gs://my_storage/foo.csv

This works on my machine or on a VM, but I can't guarantee that machine will always be on, so I'd like to run it as a GCP cronjob/Lambda type of thing. From the docs, it looks like Cloud Scheduler can only target HTTP requests, Pub/Sub, or App Engine HTTP, none of which is exactly what I want.

So: is there any way in GCP to automate some gsutil / bq commands, like a cronjob, but without my supplying an always-on machine?


Solution

  • There are likely going to be multiple answers and this is but one.

    For me, I would examine the concept of Google Cloud Run. The idea here is that you create a Docker image that is instantiated, run, and cleaned up when called by an HTTP request. What you put in your Docker image is 100% up to you. It could be a simple image with tools like gcloud and gsutil installed, plus a script that runs them with whatever parameters you need. Your contract with Cloud Run is only that you consume the incoming HTTP request.

    When there are no requests to Cloud Run, there is no charge, as nothing is running. You are billed only for the time your logic actually executes.

    I recommend Cloud Run over Cloud Functions because Cloud Run lets you define the environment in which your commands run, for example the availability of gsutil.
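    As a minimal sketch of the script such an image could bake in (call it `export.sh`; the query and bucket come from the question, while the `DRY_RUN` switch and the env-var names are assumptions added here so the flow can be exercised without the Cloud SDK installed):

    ```shell
    #!/usr/bin/env bash
    # export.sh -- hypothetical wrapper for the container image.
    set -euo pipefail

    X="${X:-2024-01-01}"      # date threshold for the query (assumed env var)
    DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 in the real container to execute

    # In dry-run mode, echo the command instead of executing it.
    run() {
      if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
    }

    run bq query --nouse_legacy_sql --format=csv \
      "SELECT * FROM myproject.whatever WHERE date > '$X'" > res.csv
    run gsutil cp res.csv gs://my_storage/foo.csv
    ```

    The real container would run this with `DRY_RUN=0` and rely on the service account attached to the Cloud Run service for BigQuery and Cloud Storage permissions.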
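    And a sketch of the image itself, built on Google's published Cloud SDK base image (the script name is a hypothetical example, not from the answer):

    ```dockerfile
    # Sketch: package the script together with gcloud, gsutil and bq.
    FROM google/cloud-sdk:slim

    WORKDIR /app
    COPY export.sh /app/export.sh
    RUN chmod +x /app/export.sh

    # Cloud Run invokes the container over HTTP, so the entrypoint must
    # answer requests on $PORT (e.g. a thin web server that shells out
    # to the script); that wrapper is omitted here for brevity.
    CMD ["/app/export.sh"]
    ```

    Once deployed with `gcloud run deploy`, a Cloud Scheduler HTTP job (authenticating with an OIDC service account) can hit the service URL on your cron schedule, which closes the loop without any always-on machine.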