ssh linux-kernel google-compute-engine apt-get gcloud

Cloud VM instance has broken packages after downgrading packages to an earlier version


I ran apt-get upgrade because the load times of our production server were about 40 seconds. I don't have a snapshot from just before or after the upgrade (although there is a snapshot that is six months old). Load times improved to roughly 15 seconds, but our erizo service, which was also running on that instance, stopped working. Restarting the services didn't help, so I tried downgrading the packages to their previous versions (https://askubuntu.com/questions/138284/how-to-downgrade-a-package-via-apt-get). For almost every package I got an error saying that the previous package version did not exist, which is strange because I had copied the versions from the output of dpkg -l.

Only a few of them were successfully downgraded, and I got a serious error when downgrading e2fslibs to its previous version: The following packages have unmet dependencies: e2fsprogs: PreDepends: e2fslibs
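For readers retracing this step, here is a minimal sketch of turning a saved dpkg -l listing into package=version arguments. The function name pins_from_dpkg_list is hypothetical. Note that an apt archive typically offers only the current version of a package per pocket, so an older version taken from dpkg -l may simply no longer be installable, which would match the "version did not exist" errors.

```shell
#!/bin/sh
# Hypothetical helper: read `dpkg -l` output on stdin and print one
# "package=version" pin for every fully installed ("ii") package.
# Header lines and packages in other states (e.g. "rc") are skipped.
pins_from_dpkg_list() {
    awk '$1 == "ii" { print $2 "=" $3 }'
}

# Possible usage (not executed here; dpkg-before.txt is an assumed file name):
#   pins_from_dpkg_list < dpkg-before.txt
#   apt-cache policy e2fslibs     # check which versions the archive still offers
#   sudo apt-get install e2fslibs=<version> e2fsprogs=<version>
```

Packages that depend on each other at an exact version, such as e2fslibs and e2fsprogs, generally have to be pinned in the same apt-get install call.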

Somehow that messed up initramfs and/or initramfs-tools and now the instance is running but I can't get into it.

  • Connecting to the instance from the Google Cloud Platform console: Connecting... Could not connect, retrying (1/3).
  • gcloud compute ssh from Google Cloud Shell fails: Permission denied (publickey).
  • Using gcloud locally also says Permission denied (publickey).

I checked the following:

  • There are project-wide public keys defined; there are no instance-level public keys or any other instance metadata (Google Cloud SSH Keys).
  • In Google Cloud Platform >> Compute Engine >> VM instances >> permissions, I see 'compute' is disabled.
  • The documentation says to verify that the daemon is running by opening the serial console output page and looking for output lines prefixed with the accounts-from-metadata: string; if you are using a standard image but do not see these prefixes, the daemon might be stopped. I don't see these lines, so I assume it is NOT running.
  • Firewall rules (gcloud compute firewall-rules list): default-allow-ssh default 0.0.0.0/0 tcp:22, so the rule is present.
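The serial console check above can also be done from the command line. This is a small sketch, assuming the serial output is piped in from gcloud compute instances get-serial-port-output; the function name account_daemon_lines is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read serial console output on stdin and keep only
# the lines that contain the "accounts-from-metadata:" marker.
# `|| true` keeps the exit status at 0 when no such line exists.
account_daemon_lines() {
    grep -F 'accounts-from-metadata:' || true
}

# Possible usage (not executed here):
#   gcloud compute instances get-serial-port-output tta-media-test-2 \
#       --zone europe-west1-c | account_daemon_lines
```

Empty output is consistent with the daemon not running, but it can also mean the instance never got far enough in the boot to start it.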

The following packages were upgraded:

  • apt
  • apt-transport-https
  • apt-utils
  • binutils
  • cloud-init
  • cloud-initramfs-growroot
  • cloud-initramfs-rescuevol
  • comerr-dev
  • dosfstools
  • e2fslibs
  • e2fsprogs
  • gce-cloud-config
  • gce-daemon
  • gce-imagebundle
  • gce-startup-scripts
  • google-cloud-sdk
  • landscape-client
  • landscape-common
  • libapt-inst1.4
  • libapt-pkg4.12
  • libcomerr2
  • libss2
  • libudev0
  • mountall
  • nginx
  • nginx-common
  • nginx-full
  • ntp
  • ntpdate
  • procps
  • python-apt
  • python-apt-common
  • python-lazr.restfulclient
  • udev
  • unattended-upgrades
  • update-manager-core
  • upstart
  • whoopsie
  • x11-utils

This is what I get from the serial output:

  • mountall: Event failed
  • landscape-client is not configured, please run landscape-config.

What to do next?

  • Apply a startup script to the running instance (following https://cloud.google.com/compute/docs/startupscript) and try to run apt-get upgrade?

  • Try to create a new public key (again) in Google Cloud Shell to access the instance?

  • In Google Cloud Shell, the key file was generated the first time I ran gcloud compute --project "enduring-palace-762" ssh --zone "europe-west1-c" "tta-media-test-2":
    WARNING: The private SSH key file for Google Compute Engine does not exist.
    WARNING: You do not have an SSH key for Google Compute Engine.
    WARNING: [/usr/bin/ssh-keygen] will be executed to generate a key.
    This tool needs to create the directory /home/developer/.ssh
  • The generated public key was stored in /home/developer/.ssh/google_compute_engine.pub. I made a copy of it, prepended the username, and added the content of the public key to Compute Engine >> Metadata >> SSH keys. The key is accepted, but the username doesn't show up like it does for all the other username/key pairs. I still get the Permission denied (publickey) error when using gcloud compute ssh tta-media-test-2 --zone europe-west1-c
  • When I provide the SSH key file like this: gcloud compute ssh tta-media-test-2 --zone europe-west1-c --ssh-key-file=my-ssh-keys_copy.pub (the working directory is the folder containing the key file), I get:
    WARNING: The public SSH key file for Google Compute Engine does not exist.
    WARNING: You do not have an SSH key for Google Compute Engine.
    WARNING: [/usr/bin/ssh-keygen] will be executed to generate a key.

  • I get the same result when I generate a new key with ssh-keygen -t rsa -f my-ssh-keys

  • Any other possible solution would be much appreciated.
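A note on the key attempts above, with a sketch. Compute Engine metadata expects each SSH key entry in the form USERNAME:KEY_TYPE KEY_DATA COMMENT, so a missing username in the console can point to a malformed entry. The helper below builds such an entry from a .pub file; its name, metadata_key_entry, is hypothetical. Separately, and as far as I can tell, --ssh-key-file expects the path of the private key and gcloud looks for the public key at that path plus .pub, which would explain the "public SSH key file ... does not exist" warning when a .pub file is passed.

```shell
#!/bin/sh
# Hypothetical helper: print a metadata ssh-keys entry,
# "USERNAME:KEY_TYPE KEY_DATA USERNAME", from a username and a .pub file.
# Only the first line with at least a key type and key data is used;
# the original comment field is replaced by the username.
metadata_key_entry() {
    user=$1
    pubfile=$2
    awk -v user="$user" 'NF >= 2 { print user ":" $1 " " $2 " " user; exit }' "$pubfile"
}

# Possible usage (not executed here):
#   metadata_key_entry developer /home/developer/.ssh/google_compute_engine.pub
#   gcloud compute ssh tta-media-test-2 --zone europe-west1-c \
#       --ssh-key-file=my-ssh-keys      # private key path, without .pub
```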

[update] I am able to SSH into the 'broken' instance from my local machine using ssh user@externalIpOfInstance. My plan is to bring it to an upgraded, stable state, create a snapshot, and go from there.

  • sudo apt-get -f install
    0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded.
    1 not fully installed or removed.
    After this operation, 0 B of additional disk space will be used.
    Setting up initramfs-tools (0.99ubuntu13.5) ...
    update-initramfs: deferring update (trigger activated)
    Processing triggers for initramfs-tools ...
    update-initramfs: Generating /boot/initrd.img-3.13.0-79-generic
    E: /usr/share/initramfs-tools/hooks/fixrtc failed with return 1.
    update-initramfs: failed for /boot/initrd.img-3.13.0-79-generic with 1.
    dpkg: error processing initramfs-tools (--configure):
     subprocess installed post-installation script returned error exit status 1
    Errors were encountered while processing:
     initramfs-tools
    E: Sub-process /usr/bin/dpkg returned an error code (1)
  • sudo apt-get upgrade
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    The following packages have been kept back:
      google-chrome-stable
    The following packages will be upgraded:
      comerr-dev libcomerr2 libss2 unattended-upgrades
    4 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
    1 not fully installed or removed.
    Need to get 0 B/188 kB of archives.
    After this operation, 4,096 B of additional disk space will be used.
    Do you want to continue [Y/n]? y
    Preconfiguring packages ...
    (Reading database ... 178509 files and directories currently installed.)
    Preparing to replace comerr-dev 2.1-1.42-1ubuntu2.2 (using .../comerr-dev_2.1-1.42-1ubuntu2.3_amd64.deb) ...
    Unpacking replacement comerr-dev ...
    Preparing to replace libcomerr2 1.42-1ubuntu2.2 (using .../libcomerr2_1.42-1ubuntu2.3_amd64.deb) ...
    Unpacking replacement libcomerr2 ...
    Preparing to replace libss2 1.42-1ubuntu2.2 (using .../libss2_1.42-1ubuntu2.3_amd64.deb) ...
    Unpacking replacement libss2 ...
    Preparing to replace unattended-upgrades 0.76ubuntu1.1 (using .../unattended-upgrades_0.76ubuntu1.2_all.deb) ...
    Unpacking replacement unattended-upgrades ...
    Processing triggers for install-info ...
    Processing triggers for man-db ...
    Processing triggers for ureadahead ...
    Setting up initramfs-tools (0.99ubuntu13.5) ...
    update-initramfs: deferring update (trigger activated)
    Setting up libcomerr2 (1.42-1ubuntu2.3) ...
    Setting up comerr-dev (2.1-1.42-1ubuntu2.3) ...
    Setting up libss2 (1.42-1ubuntu2.3) ...
    Setting up unattended-upgrades (0.76ubuntu1.2) ...
    Processing triggers for initramfs-tools ...
    update-initramfs: Generating /boot/initrd.img-3.13.0-79-generic
    E: /usr/share/initramfs-tools/hooks/fixrtc failed with return 1.
    update-initramfs: failed for /boot/initrd.img-3.13.0-79-generic with 1.
    dpkg: error processing initramfs-tools (--configure):
     subprocess installed post-installation script returned error exit status 1
    No apport report written because MaxReports is reached already
    Processing triggers for libc-bin ...
    ldconfig deferred processing now taking place
    Errors were encountered while processing:
     initramfs-tools
    E: Sub-process /usr/bin/dpkg returned an error code (1)
  • sudo apt-get remove initramfs-tools-bin
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming.
    The following information may help to resolve the situation:

The following packages have unmet dependencies:

  • cron : Depends: adduser but it is not going to be installed
  • procps : Depends: initscripts
  • upstart : Depends: initscripts Depends: mountall Depends: ifupdown (>= 0.6.10ubuntu5)

E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.

What should I do here?
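When working through output like the above, it can help to pull out just the packages dpkg reports as failed. The sketch below assumes the apt-get output is piped in; the function name failed_packages is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read apt-get/dpkg output on stdin and print the
# package names listed after "Errors were encountered while processing:".
# Collection stops at the next line that starts with a message prefix
# such as "E:" or "W:".
failed_packages() {
    awk '
        /^Errors were encountered while processing:/ { grab = 1; next }
        /^[A-Z]:/ { grab = 0 }
        grab && NF { print $1 }
    '
}

# Possible usage (not executed here):
#   sudo apt-get -f install 2>&1 | failed_packages
#   sudo update-initramfs -u -v -k 3.13.0-79-generic   # verbose run to see where fixrtc fails
#   sudo dpkg --configure -a                           # retry configuration afterwards
```

Here the only failed package is initramfs-tools, and the actual failure is in its fixrtc hook. One plausible cause, given the e2fsprogs PreDepends error earlier, is that a tool the hook needs is missing or broken after the partial downgrade. That is an assumption and not something the output confirms.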


Solution

  • If you were able to SSH into the instance with a given SSH key before, the most likely reasons for it to stop working are that the key was somehow removed or that the SSH daemon isn't running or is otherwise broken. It appears that the downgrade broke this machine.

    Why do you need this particular VM instance? Does it have important data? If so, you can shut it off, mount its disk using a fresh VM instance, and copy that data off.

    If it runs a service, you should probably cut over to a new machine: even if you're able to get into the instance, there's no telling what still works and what doesn't.
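The "mount its disk using a fresh VM instance" route could look roughly like the sketch below. It works on a snapshot copy of the disk, so the original is not modified. The instance name and zone come from the question; the disk, snapshot, copy, and rescue VM names are assumptions, as is the idea that the boot disk is named after the instance. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only. INSTANCE and ZONE are taken from the question; every other
# name below is an assumption and should be adjusted.
INSTANCE=tta-media-test-2
ZONE=europe-west1-c
DISK=${DISK:-$INSTANCE}                      # assumes boot disk is named like the instance
SNAPSHOT=${SNAPSHOT:-$INSTANCE-rescue-snap}  # hypothetical snapshot name
COPY=${COPY:-$INSTANCE-rescue-disk}          # hypothetical name for the disk copy
RESCUE_VM=${RESCUE_VM:-rescue-vm}            # hypothetical fresh VM in the same zone

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Stop the broken VM, snapshot its disk, create a copy from the snapshot,
# and attach that copy to the rescue VM.
rescue_plan() {
    run gcloud compute instances stop "$INSTANCE" --zone "$ZONE"
    run gcloud compute disks snapshot "$DISK" --zone "$ZONE" --snapshot-names "$SNAPSHOT"
    run gcloud compute disks create "$COPY" --zone "$ZONE" --source-snapshot "$SNAPSHOT"
    run gcloud compute instances attach-disk "$RESCUE_VM" --disk "$COPY" --zone "$ZONE"
}
```

Calling rescue_plan prints the four gcloud commands for review. After the copy is attached, it can be mounted read-only on the rescue VM (for example with sudo mount -o ro on the new device; the device name depends on the VM) and the data copied off.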