Tags: azure, ssh, redhat, terraform, packer

Unable to SSH onto VM created by Packer image deployed by Terraform


λ packer version
Packer v1.3.2

The Packer file:

{
  "builders"                           : [{
    "type"                             : "azure-arm",

    "client_id"                        : "asdf",
    "client_secret"                    : "asdf",
    "tenant_id"                        : "asdf",
    "subscription_id"                  : "asdf",

    "managed_image_resource_group_name": "asdf",
    "managed_image_name"               : "cis-rhel7-l1",

    "os_type"                          : "Linux",
    "image_publisher"                  : "center-for-internet-security-inc",
    "image_offer"                      : "cis-rhel-7-v2-2-0-l1",
    "image_sku"                        : "cis-rhel7-l1",

    "plan_info"                        : {
        "plan_name"                    : "cis-rhel7-l1",
        "plan_product"                 : "cis-rhel-7-v2-2-0-l1",
        "plan_publisher"               : "center-for-internet-security-inc"
    },

    "communicator"                     : "ssh",

    "azure_tags"                       : {
        "docker"                       : "18.09.0"
    },

    "location"                         : "West Europe",
    "vm_size"                          : "Standard_D2_v3"
  }],
  "provisioners"                       : [
        {
            "type"                     : "shell",
            "script"                   : "./cisrhel7-script.sh"
        }
    ]
}
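
For completeness, the image is validated and built with the standard Packer commands; the template file name here is an assumption:

# validate the template, then build the managed image
# (the file name "cisrhel7.json" is assumed)
packer validate cisrhel7.json
packer build cisrhel7.json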

The script it's calling:

#!/bin/bash
DOCKERURL="asdf"

sudo -E sh -c 'echo "asdf/rhel" > /etc/yum/vars/dockerurl'

sudo sh -c 'echo "7" > /etc/yum/vars/dockerosversion'

sudo yum install -y yum-utils device-mapper-persistent-data lvm2

sudo yum-config-manager --enable rhel-7-server-extras-rpms

sudo yum-config-manager --enable rhui-rhel-7-server-rhui-extras-rpms

curl -sSL "asdf/rhel/gpg" -o /tmp/storebits.gpg

sudo rpm --import /tmp/storebits.gpg

sudo -E yum-config-manager --add-repo "asdf/rhel/docker-ee.repo"

sudo yum -y install docker-ee-18.09.0

sudo yum-config-manager --enable docker-ee-stable-18.09

sudo systemctl unmask --now firewalld.service

sudo systemctl enable --now firewalld.service

systemctl status firewalld

list=(
    "22/tcp"
    "80/tcp"
    "179/tcp"
    "443/tcp"
    "2376/tcp"
    "2377/tcp"
    "4789/udp"
    "6443/tcp"
    "6444/tcp"
    "7946/tcp"
    "7946/udp"
    "10250/tcp"
    "12376/tcp"
    "12378/tcp"
    "12379/tcp"
    "12380/tcp"
    "12381/tcp"
    "12382/tcp"
    "12383/tcp"
    "12384/tcp"
    "12385/tcp"
    "12386/tcp"
    "12387/tcp"
    "12388/tcp"
)
for i in "${list[@]}"; do
    sudo firewall-cmd --zone=public --add-port=$i --permanent
done

sudo firewall-cmd --reload

sudo firewall-cmd --list-all

sudo systemctl stop docker

sudo sh -c 'echo "{\"storage-driver\": \"overlay2\"}" > /etc/docker/daemon.json'

CURRENT_USER=$(whoami)

if [ "$CURRENT_USER" != "root" ]
then
        # append the current user to the docker group (supplementary group, not primary)
        sudo usermod -aG docker "$CURRENT_USER"
fi

sudo systemctl start docker

sudo docker info

I then use Terraform to deploy it:

# skipping pre-TF resources...
resource "azurerm_virtual_machine" "main" {
  name                              = "${var.prefix}-vm"
  location                          = "${azurerm_resource_group.main.location}"
  resource_group_name               = "${azurerm_resource_group.main.name}"
  network_interface_ids             = ["${azurerm_network_interface.main.id}"]
  vm_size                           = "Standard_D2_v3"

  delete_os_disk_on_termination     = true

  storage_image_reference {
    id                            = "${data.azurerm_image.custom.id}"
  }

  storage_os_disk {
    name                            = "${var.prefix}-osdisk"
    caching                         = "ReadWrite"
    create_option                   = "FromImage"
    managed_disk_type               = "Standard_LRS"
  }

  os_profile {
    computer_name                   = "${var.prefix}"
    admin_username                  = "rhel76"
  }

  os_profile_linux_config {
    disable_password_authentication = true

    ssh_keys {
      path                          = "/home/rhel76/.ssh/authorized_keys"
      key_data                      = "${file("rhel76.pub")}"
    }
  }

  plan {
      name                          = "cis-rhel7-l1"
      publisher                     = "center-for-internet-security-inc"
      product                       = "cis-rhel-7-v2-2-0-l1"
  }
}
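
The rhel76 / rhel76.pub pair referenced by file("rhel76.pub") above (and by ssh -i rhel76 below) is a key pair I generated locally beforehand; the exact ssh-keygen invocation is not important, but it was something like:

# generate the key pair whose public half Terraform injects via ssh_keys
ssh-keygen -t rsa -b 4096 -f rhel76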

Builds OK, deploys OK, but when I go to connect:

λ ssh -i rhel76 rhel76@some-ip
The authenticity of host 'some-ip (some-ip)' can't be established.
ECDSA key fingerprint is SHA256:some-fingerprint.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'some-ip' (ECDSA) to the list of known hosts.
Authorized uses only. All activity may be monitored and reported.
rhel76@some-ip: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
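
(For anyone debugging the same symptom: running the client verbosely shows whether the key is actually offered and then refused by the server, which points at authorized_keys on the VM rather than a problem with the local key.)

ssh -vvv -i rhel76 rhel76@some-ip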

I am not sure if this is a Packer or a Terraform issue. I deployed the base image "cis-rhel7-l1" via Terraform, changing only the image reference from my custom image to the base one and leaving the SSH key part alone, and it worked fine (I was able to SSH in OK).

The only way I can connect to my VM is by doing an SSH key reset within Azure. I reset it using the admin_username rhel76 (from the template) and it worked fine; I checked /home/rhel76/.ssh/* and everything was there, though obviously it would be, since I had just done a reset. So I rebuilt the whole thing again without any changes, and when I couldn't log in the next time I did an SSH key reset for a random username, asdf, and then had a look at the /home/rhel76 directory: the .ssh/ directory and .ssh/authorized_keys file were not there, as if nothing had had the rights to create them.
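
For reference, this is roughly how I checked what the key reset had created; sshd (with the default StrictModes) refuses keys when the home directory, ~/.ssh or authorized_keys is owned by the wrong user or is group/world-writable, so ownership and permissions are worth verifying too:

# check that the directory and file exist, and who owns them
ls -ld /home/rhel76/.ssh
ls -l /home/rhel76/.ssh/authorized_keys
cat /home/rhel76/.ssh/authorized_keys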

I have fiddled with the script since then, trying to create those folders and chmod them myself, just in case, but that never works, as I get errors during the Packer build:

azure-arm: chmod: cannot access ‘/home/rhel76/.ssh/authorized_keys’: Permission denied
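
For context, what I was adding to cisrhel7-script.sh was roughly along these lines (a reconstruction, not the exact commands):

# rough reconstruction of the workaround I tried: pre-create the .ssh path
# for the admin user that Terraform is supposed to configure later
sudo mkdir -p /home/rhel76/.ssh
sudo touch /home/rhel76/.ssh/authorized_keys
sudo chmod 700 /home/rhel76/.ssh
sudo chmod 600 /home/rhel76/.ssh/authorized_keys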

Anyone have any ideas?


Solution

  • So it turns out you need to run a de-provision of the Azure Linux agent, which I've done by adding the command to the provisioners section:

      "provisioners"                       : [
            {
                "type"                     : "shell",
                "script"                   : "./cisrhel7-script.sh"
            },
            {
                "type"                     : "shell",
                "inline"                   : [
                    "echo '************ DEPROVISION'",
                    "sudo /usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"
                ]
            }
        ]
    }
    

    Taken from: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/build-image-with-packer