google-cloud-dataproc, google-cloud-datalab

Can't launch DataLab on Google DataProc after a while


I created a cluster on Dataproc with Datalab installed. I used the following commands to access Datalab:

export ZONE=us-central1-b;export CLUSTER_NAME=test;

gcloud compute ssh ${CLUSTER_NAME}-m --zone=${ZONE} --ssh-flag='-D 10001' --ssh-flag='-N' --ssh-flag='-n'

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
"http://${CLUSTER_NAME}-m:8080" \
--proxy-server='socks5://localhost:10001' \
--host-resolver-rules='MAP * 0.0.0.0 , EXCLUDE localhost' \
--user-data-dir='/tmp'

It worked for a while. I didn't change anything at all, but after about 2-3 hours, when I ran the same commands again, I could no longer access Datalab and got the following error:

ERROR: (gcloud.compute.ssh) Instance [test-m] in zone [us-central1-b] has not been allocated an external IP address yet. Try rerunning this command later.

I tried many times later on and always hit the same error. This happens with every cluster I create (i.e., I lose access to the cluster's Datalab after a while). Can anyone please help me with this? Thank you.


Solution

  • Assuming you're not in the narrow window at instance startup when the address hasn't been allocated yet, the external IP shouldn't get deallocated at runtime, so this is most likely a misleading error.

    Usually this error appears erroneously when an instance is in a TERMINATED state. This is in contrast to instances configured not to use an external IP at all, where you'd instead get a message like Instance [foo] in zone [bar] does not have an external IP address. For a TERMINATED instance there is no active VM resource, but the configuration metadata must still contain a networkInterfaces config to preserve the full configuration of the instance, and the gcloud compute logic currently assumes that if networkInterfaces.accessConfigs is defined, it is expected to "eventually" have a natIP field.

    Check to make sure someone didn't click STOP on your VM while you were away. Starting the VM back up should get it working again (see the sketch below).
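
    As a quick way to confirm the TERMINATED state and restart the master, something like the following should work (a sketch assuming the same ZONE and CLUSTER_NAME variables exported in the question):

    gcloud compute instances describe ${CLUSTER_NAME}-m --zone=${ZONE} --format='yaml(status, networkInterfaces)'

    If status reports TERMINATED (and accessConfigs shows no natIP), start the VM back up and then re-run the SSH tunnel command:

    gcloud compute instances start ${CLUSTER_NAME}-m --zone=${ZONE}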