SSHOperator with ComputeEngineSSHHook


I am trying to run a command over SSH on a GCP VM from Airflow, using the SSHOperator as described here:

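from airflow.providers.google.cloud.hooks.compute_ssh import ComputeEngineSSHHook
from airflow.providers.ssh.operators.ssh import SSHOperator

# <MYINSTANCE>, <MYPROJECT> and <MYZONE> are placeholders for the real values.
ssh_to_vm_task = SSHOperator(
    task_id="ssh_to_vm_task",
    ssh_hook=ComputeEngineSSHHook(
        instance_name=<MYINSTANCE>,
        project_id=<MYPROJECT>,
        zone=<MYZONE>,
        use_oslogin=False,
        use_iap_tunnel=True,  # connect through an IAP tunnel
        use_internal_ip=False,
    ),
    command="echo test_message",
    dag=dag,  # the DAG object is defined elsewhere in the file
)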

However, the task fails with this error: airflow.exceptions.AirflowException: SSH operator error: [Errno 2] No such file or directory: 'gcloud'

Airflow runs in Docker and was set up with docker-compose following these instructions.

Other Airflow GCP operators (such as BigQueryCheckOperator) work correctly, so at first sight this does not look like a configuration problem.

Could you please help me? Is this a bug?


Solution

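  • The issue seems to be that gcloud is not installed in the Docker container by default. With use_iap_tunnel=True the hook appears to call the gcloud CLI to open the tunnel, while operators such as BigQueryCheckOperator do not need that binary, which would explain why they keep working. Following the instructions here solved it: add

    RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" \
          | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list \
        && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg \
          | apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - \
        && apt-get update -y \
        && apt-get install google-cloud-sdk -y

    to the Dockerfile that builds the Airflow image and installs its dependencies, then rebuild the image. If the base image runs as a non-root user (the official apache/airflow image does), the line likely needs USER root before it and a switch back to the original user afterwards.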
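
After rebuilding the image, it can help to confirm from inside the Airflow container that gcloud is actually visible to Python. The sketch below is only an illustration, not part of Airflow: the helper names is_on_path and launch_error and the bogus binary name used in the comment are made up for this example. It checks the PATH with shutil.which and shows that starting a missing executable produces the same [Errno 2] seen in the traceback.

```python
import errno
import shutil
import subprocess


def is_on_path(executable: str) -> bool:
    """Return True if `executable` can be found on the current PATH."""
    return shutil.which(executable) is not None


def launch_error(executable: str):
    """Try to start `executable --version`.

    Returns the errno of the FileNotFoundError if the binary is missing,
    or None if the process could be started.
    """
    try:
        subprocess.run(
            [executable, "--version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,  # only the launch matters here, not the exit code
        )
    except FileNotFoundError as exc:
        return exc.errno
    return None


# A missing binary gives errno.ENOENT, i.e. "[Errno 2] No such file or directory".
# Example with a deliberately bogus name: launch_error("no-such-binary-xyz123")
print("gcloud on PATH:", is_on_path("gcloud"))
```

Run it with the same user and environment as the Airflow worker (for example through docker-compose exec on the worker service), since the PATH of another shell may differ. If it prints False after the rebuild, the worker is probably still running the old image.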