I have a cluster with one master node and 2 worker nodes (n2-standard-4). The Jupyter notebook was working earlier but stopped working this morning with the error message "504. That’s an error"
when we try to launch the notebook from the console. I am able to SSH to the master node.
Checked the status: all 3 services below are running and green.
systemctl status jupyter
systemctl status knox
systemctl status google-dataproc-component-gateway
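The three status checks above can also be scripted in one pass (a small sketch; `systemctl is-active` prints just the unit state, and the fallback keeps it safe to run on a machine without systemd):

```shell
# Print the state of each Component Gateway web-UI service in one pass.
# Service names are taken from the checks above.
for svc in jupyter knox google-dataproc-component-gateway; do
  state=$(systemctl is-active "$svc" 2>/dev/null || true)
  echo "$svc: ${state:-unknown}"
done
```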
Restart: tried restarting all three, but no luck.
sudo systemctl restart jupyter
sudo systemctl restart knox
sudo systemctl restart google-dataproc-component-gateway
Looking at the Monitoring tab on the console, 'YARN memory' has dropped to 0, and there is no data for the YARN/HDFS metrics (YARN memory, YARN pending memory, YARN NodeManagers, HDFS capacity) after 10am. The rest of the graphs (CPU, network bytes, etc.) do have data after 10am.
Update: tried restarting YARN and HDFS on the master node, which didn't help either.
sudo systemctl restart hadoop-yarn-resourcemanager.service
sudo systemctl restart hadoop-hdfs-namenode.service
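Since the NodeManager metric also went flat, it may be worth checking whether the workers' NodeManagers are still registered with the ResourceManager after the restart. A hedged sketch (meant to run on the master node; the directory guard just makes it safe to run anywhere):

```shell
# List every NodeManager and its state (RUNNING/LOST/UNHEALTHY/...).
# Falls back to a note when Hadoop is not installed on this machine.
if [ -d /usr/lib/hadoop-yarn ]; then
  yarn node -list -all
else
  echo "Hadoop not installed here - run this on the Dataproc master node"
fi
```

Nodes stuck in LOST or UNHEALTHY would explain YARN memory reporting 0 even though the ResourceManager service itself shows green.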
It looks like this is an internal error in the notebook. I recommend opening a case with Google for this, as suggested in this article.

If you see no available YARN memory and other resources, I believe it is because there are no active jobs currently running (possibly due to the error), as this is part of the autoscaling feature. You will only see availability and utilisation of these resources once you run a job.
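One way to test that explanation is to submit a trivial job and watch whether the YARN graphs come back. A sketch, assuming the stock Spark example jar is at its usual Dataproc path (verify the path on your image version):

```shell
# Submit the bundled SparkPi example so YARN allocates containers;
# the monitoring graphs should show YARN memory usage while it runs.
EXAMPLES_JAR=/usr/lib/spark/examples/jars/spark-examples.jar  # assumed default path
if [ -f "$EXAMPLES_JAR" ]; then
  spark-submit --class org.apache.spark.examples.SparkPi "$EXAMPLES_JAR" 10
else
  echo "example jar not found - run this on the Dataproc master node"
fi
```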
Helpful links: