Hi, does anyone know how to create and run an instance from a Jupyter notebook and/or a Datalab instance in the cloud?
I'm trying to run a large computation and want to divide the tasks among several VMs in the cloud. Is there any way I could create and run an instance from my Datalab notebook?
For example, I want to run each iteration of a 10-iteration for loop on a different VM. For that I need to create and run a VM from inside the Datalab notebook where my code is. Thanks for the help!
Edit: This is an example of a startup script I'm using.
gcloud compute instances create instance11 \
--metadata startup-script='#! /bin/bash
sudo apt update
sudo apt-get install -y python3.6
wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
pip --version
pip install pandas --user
pip install scipy --user
pip install scikit-learn --user
pip install sklearn --user
pip install matplotlib --user
gsutil cp gs://bucket/datafile /home directory
gsutil cp gs://bucket/pythonfile /homedirectory
' --machine-type n1-standard-32
The commentary on your question is good and I agree with it. With the proper dependencies installed, you can call gcloud commands from your Jupyter notebook to spin up VMs. For example, to spin up an n1-standard-1 Debian 9 instance in zone us-east1-b:
gcloud compute instances create <name> --image-family debian-9 \
--image-project debian-cloud --machine-type=n1-standard-1 --zone=us-east1-b
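From a notebook cell, the same command can be issued through Python's subprocess module. Here is a minimal sketch, assuming the gcloud CLI is installed and authenticated on the machine running the notebook. The helper names build_create_command and create_instance and the worker-N instance names are made up for illustration; the zone, machine type and image family are simply the values from the command above.

```python
import subprocess


def build_create_command(name, zone="us-east1-b", machine_type="n1-standard-1",
                         image_family="debian-9", image_project="debian-cloud"):
    """Return the gcloud argument list for creating one VM (hypothetical helper)."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--image-family={image_family}",
        f"--image-project={image_project}",
    ]


def create_instance(name, **kwargs):
    """Run gcloud for one VM; raises CalledProcessError if gcloud exits non-zero."""
    cmd = build_create_command(name, **kwargs)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


# One command per loop iteration: ten workers named worker-0 ... worker-9.
# Nothing is created here; call create_instance(name) to actually run gcloud.
commands = [build_create_command(f"worker-{i}") for i in range(10)]
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when instance names or paths are built in Python.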
I was wondering how I'd pass commands to the VM without having to manually SSH into it. I tried using a startup script but it doesn't get executed.
The cloud-native mechanism for doing this would indeed be to use a startup script to ensure your machine builds are reproducible, rather than logging in via SSH and running commands imperatively at a shell.
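One way to avoid shell-quoting problems with an inline script is to write the script to a file and pass it with --metadata-from-file. The sketch below assumes that approach; the script contents, the helper names and the instance name worker-0 are illustrative only. Note that startup scripts run as root, so sudo is not needed inside them.

```python
import tempfile

# Example script body (an assumption for illustration, not the asker's script).
STARTUP_SCRIPT = """#!/bin/bash
apt-get update
apt-get install -y python3-pip
"""


def write_startup_script(text):
    """Write the script to a temporary .sh file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
        f.write(text)
        return f.name


def build_create_command_with_script(name, script_path, zone="us-east1-b"):
    """Return the gcloud argument list that attaches the script as instance metadata."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--metadata-from-file=startup-script={script_path}",
    ]


script_path = write_startup_script(STARTUP_SCRIPT)
cmd = build_create_command_with_script("worker-0", script_path)
# To create the VM, pass cmd to subprocess.run(cmd, check=True).
```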
If you have problems running a startup script, I recommend creating an instance and attempting to run it manually as the root
user. Otherwise, post an example of the script you are using so we can assist further.
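Before logging in, it can also help to read the instance's serial console, where startup script output is normally logged. The following is a rough sketch using gcloud compute instances get-serial-port-output; the helper names are hypothetical, and filtering on the string "startup-script" is an assumption about how the log lines are labelled on your image.

```python
import subprocess


def build_serial_output_command(name, zone="us-east1-b"):
    """Return the gcloud argument list that prints a VM's serial console log."""
    return ["gcloud", "compute", "instances", "get-serial-port-output",
            name, f"--zone={zone}"]


def startup_script_lines(serial_output, marker="startup-script"):
    """Keep only the lines of the log that contain the marker string."""
    return [line for line in serial_output.splitlines() if marker in line]


def fetch_startup_log(name, zone="us-east1-b"):
    """Run gcloud and return the filtered lines; raises if gcloud fails."""
    result = subprocess.run(build_serial_output_command(name, zone),
                            check=True, capture_output=True, text=True)
    return startup_script_lines(result.stdout)
```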
Finally, as this commenter noted, you may be attempting to solve the wrong problem by misusing the framework within which you're operating. If this proves challenging, you should consider taking a step back to define a more robust mechanism using the native Google tools and your own code to implement your requirement.