Tags: python, python-2.7, google-cloud-platform, gcloud, google-cloud-shell

Minimizing disk-space of gcloud CLI installation


My server admin restricts the disk space to about 50 MB. The default gcloud install (with alpha) on Linux takes about 150 MB. I need to reduce the install size to fit my drive space.

I tried using PyInstaller (https://www.pyinstaller.org/) on lib/gcloud.py, since bin/gcloud is a bash script. The resulting executable (in lib/dist) did not work.

I also tried zipping some of the libs (lib/surface, and some others) and adding the resulting .zip files to sys.path in lib/gcloud.py. This should allow zipimport to load them from the archives while conserving disk space.
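Roughly, the change to lib/gcloud.py looked something like the snippet below (the archive names are just examples; each zip is assumed to contain the corresponding package at its top level so zipimport can find it):

import os
import sys

# Prepend the zipped packages to sys.path so zipimport picks them up.
# Archive names are illustrative -- one per lib/ package that was zipped.
_LIB_DIR = os.path.dirname(os.path.abspath(__file__))
for _archive in ('surface.zip', 'third_party.zip'):
    sys.path.insert(0, os.path.join(_LIB_DIR, _archive))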

While this approach reduced the size to below 50 MB and works quite well for some gcloud commands, it does not work for cloud-shell.

I noticed that there are a lot of .pyc files alongside the .py files; for example, both gcloud.py and gcloud.pyc are present in lib/. This seems like a waste, so I ran python -m compileall . in the root folder, followed by find . -iname '*.py' -delete. This also did not work, but it did reduce the disk usage to below 40 MB.

I am most interested in using gcloud alpha cloud-shell, not the other APIs. The above approach (.zip files appended to sys.path) gives this error with gcloud alpha cloud-shell ssh/scp:

ERROR: gcloud crashed (IOError): [Errno 20] Not a directory

A zip file of a fully functional gcloud installation directory comes to under 20 MB, so there has to be a way to fit it in 50 MB. Any ideas?

UPDATE:

If you are comfortable with using the OAuth2 workflow, see joffre's answer below.

Personally, I find it quite troublesome to use OAuth2. In fact, one of the major benefits of the gcloud CLI for me is that once gcloud init is done, all auth problems are solved.

In the byte-compile approach I tried earlier, __init__.py files were also getting removed. *.json files also seem non-essential to functionality (though they might contain help strings). So the cleanup becomes:

python -m compileall .
find . -iname '*.py' -not -iname '__init__.py' -delete
find . -iname '*.json' -delete

This brings the total install size down to 40-45 MB.

Note that it is also possible to do the reverse, i.e. delete all *.pyc files while keeping all *.py files. This also reduces disk space, but not by as much (since most *.pyc files seem to be smaller than the corresponding *.py files).


Solution

  • You don't need the gcloud CLI to connect to your Cloud Shell.

    If you run gcloud alpha cloud-shell ssh --log-http, you'll see what the tool is actually doing, so you can manually replicate this.

    First, be sure that your SSH public key is in the environment, which can be done through the API (it does not even need to be done from the server you're trying to connect from).

    Then, you have to start the environment, which can be done through this API endpoint, and you have to wait until the returned operation is finished, which can be done through this other API endpoint. Note that this can be done from your environment (which would require OAuth authentication), or you can do this from an external service (e.g. program a Cloud Function to start the Cloud Shell environment when you call a specific endpoint).

    Finally, once the environment is started, you need to get the information for connecting to your Cloud Shell instance through this API endpoint (which, again, does not even need to be done from the server you're connecting from), and then connect via SSH from the server using that information. A rough sketch of these calls is included at the end of this answer.

    This will limit the tooling required on your server to a simple SSH client (which is likely already pre-installed).

    With the links I provided, you can do all of this manually and check whether it works properly. However, this is tedious to do by hand, so I would likely create a Cloud Function that makes all the required API calls and returns the connection information in the body of the response. I might even be lazy enough to have the function return the explicit ssh command that needs to be run, so that once I'm connected to the server, I just need to run curl <my_function_URL>|sh and everything will work.

    If you try to do something like this, be extremely sure to verify that it is secure in your setup (so, be sure not to add unneeded keys to your Cloud Shell environment), since I'm just writing this off the top of my head, and having an exposed Cloud Function feels somewhat insecure (anybody calling that Cloud Function would at least know the IP of your Cloud Shell environment). But at least, this is an idea that you could explore.
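    For reference, here is a rough, untested sketch (Python 2.7, to match the asker's environment) of what those calls could look like. It assumes the v1 Cloud Shell REST API (users/me/environments/default with the :addPublicKey and :start methods, plus operations polling) and that you already have a valid OAuth2 access token from somewhere; the endpoint and field names should be double-checked against the Cloud Shell API reference before relying on them.

    import json
    import time
    import urllib2

    TOKEN = 'ya29....'                       # placeholder OAuth2 access token
    PUBLIC_KEY = 'ssh-rsa AAAA... you@host'  # placeholder SSH public key
    BASE = 'https://cloudshell.googleapis.com/v1'
    ENV = 'users/me/environments/default'

    def call(method, path, body=None):
        # Minimal authenticated JSON request against the Cloud Shell API.
        data = json.dumps(body) if body is not None else None
        req = urllib2.Request(BASE + '/' + path, data=data)
        req.add_header('Authorization', 'Bearer ' + TOKEN)
        req.add_header('Content-Type', 'application/json')
        req.get_method = lambda: method
        return json.load(urllib2.urlopen(req))

    def wait(op):
        # Poll a long-running operation until it is done.
        while not op.get('done'):
            time.sleep(3)
            op = call('GET', op['name'])
        return op

    # 1. Make sure the public key is on the environment.
    try:
        wait(call('POST', ENV + ':addPublicKey', {'key': PUBLIC_KEY}))
    except urllib2.HTTPError:
        pass  # most likely the key is already present

    # 2. Start the environment and wait for the operation to finish.
    wait(call('POST', ENV + ':start', {}))

    # 3. Fetch the connection details and print the ssh command to run.
    env = call('GET', ENV)
    print('ssh -p %s %s@%s' % (env['sshPort'], env['sshUsername'], env['sshHost']))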