google-cloud-platform ssh virtual-machine tpu

GCP TPU SSH issue


I am working on a TPU created on Google Cloud. This evening, when I tried to SSH into my TPU machine, I got the following error:

XXX@ip-address: Permission denied (publickey).
Retrying: SSH command error: [/usr/bin/ssh] exited with return code [255]

I deleted the keys from the ~/.ssh/ folder and reran the SSH command. It generated a new public/private key pair in the .ssh folder, and I copied the public key into GCP => VM => Metadata.


Solution

  • I tried to replicate your issue in my test environment. I was able to create a Cloud TPU VM and log in over SSH successfully using the gcloud command in Cloud Shell.

    This error can occur for several reasons. The following are two of the most common causes:

    1. You use an SSH key stored in metadata to connect to a VM that has OS Login enabled.

      If OS Login is enabled on your project, your VM doesn't accept SSH keys that are stored in metadata. To resolve this issue, you can try one of the following:

      1. Connect to your VM using the Google Cloud Console or the gcloud command-line tool.

      2. Add your SSH keys to OS Login. For more information, see Adding SSH keys to a user account.

      3. Disable OS Login. For more information, see Disabling OS Login.

    2. The firewall rule allowing SSH is missing or misconfigured.

      By default, Compute Engine VMs allow SSH access on port 22 through the default-allow-ssh firewall rule. If that rule is missing or misconfigured, you won't be able to connect to VMs. To resolve this issue, check your firewall rules and re-add or reconfigure default-allow-ssh.

    Refer to this link for more information on troubleshooting SSH.
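
The fixes listed above map onto gcloud commands roughly as sketched below. This is a minimal illustration, not an official procedure: the helper functions are hypothetical, `my-tpu` and `us-central1-a` are placeholder values, and the key path assumes gcloud's default key location. The script only builds and prints the commands so you can review them before running anything yourself.

```python
import os
import shlex

# Hypothetical placeholders: replace with your own TPU name and zone.
TPU_NAME = "my-tpu"
ZONE = "us-central1-a"
# Assumed key location (gcloud's default key pair); adjust if yours differs.
PUBLIC_KEY = os.path.expanduser("~/.ssh/google_compute_engine.pub")


def tpu_ssh(tpu_name, zone):
    """Cause 1, option 1: connect through gcloud instead of plain ssh."""
    return ["gcloud", "compute", "tpus", "tpu-vm", "ssh", tpu_name,
            f"--zone={zone}"]


def oslogin_add_key(key_file):
    """Cause 1, option 2: register a public key with your OS Login profile."""
    return ["gcloud", "compute", "os-login", "ssh-keys", "add",
            f"--key-file={key_file}"]


def disable_oslogin_project():
    """Cause 1, option 3: turn OS Login off at the project level."""
    return ["gcloud", "compute", "project-info", "add-metadata",
            "--metadata", "enable-oslogin=FALSE"]


def list_firewall_rules():
    """Cause 2: inspect existing rules to see whether default-allow-ssh exists."""
    return ["gcloud", "compute", "firewall-rules", "list"]


def recreate_default_allow_ssh(network="default"):
    """Cause 2: re-add the SSH rule (assumes the network is named 'default').

    Consider adding --source-ranges to restrict who can reach port 22.
    """
    return ["gcloud", "compute", "firewall-rules", "create",
            "default-allow-ssh", f"--network={network}", "--allow=tcp:22"]


def suggested_commands():
    """Collect the commands in the order the answer discusses them."""
    return [
        tpu_ssh(TPU_NAME, ZONE),
        oslogin_add_key(PUBLIC_KEY),
        disable_oslogin_project(),
        list_firewall_rules(),
        recreate_default_allow_ssh(),
    ]


# Print shell-quoted commands for review; nothing is executed here.
for cmd in suggested_commands():
    print(shlex.join(cmd))
```

You normally need only one of the three OS Login options, and the firewall rule should be recreated only if the listing shows that default-allow-ssh is actually missing.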