
Start tmux sessions in a Google Cloud startup-script


I added a startup-script entry to the metadata of my Google Cloud instance, as suggested in the documentation. The answers to the question "Google Compute Engine - Start tmux with startup-script" didn't work for me. My startup-script code is:

    #! /bin/bash
    tmux start-server
    tmux new -d -s data_vis_pfs 'pachctl mount /var/data_vis/pfs' 
    tmux new -d -s data_vis_server 'cd /var/data_vis/server/ && python ./index.py' 
    tmux new -d -s data_vis_client 'cd /var/data_vis/client/ && npx serve -l 3001 -s build'

I also tried:

    #! /bin/bash
    tmux start \; \
         new -d -s data_vis_pfs 'pachctl mount /var/data_vis/pfs' \; \
         new -d -s data_vis_server 'cd /var/data_vis/server/ && python ./index.py' \; \
         new -d -s data_vis_client 'cd /var/data_vis/client/ && npx serve -l 3001 -s build'

When I run sudo journalctl -u google-startup-scripts.service after the machine boots, I get:

    Aug 24 12:20:40 work1-cpu systemd[1]: Starting Google Compute Engine Startup Scripts...
    Aug 24 12:20:42 work1-cpu GCEMetadataScripts[506]: 2021/08/24 12:20:42 GCEMetadataScripts: Starting startup scripts (version 20201214.00).
    Aug 24 12:20:42 work1-cpu GCEMetadataScripts[506]: 2021/08/24 12:20:42 GCEMetadataScripts: Found startup-script in metadata.
    Aug 24 12:20:42 work1-cpu GCEMetadataScripts[506]: 2021/08/24 12:20:42 GCEMetadataScripts: startup-script exit status 0
    Aug 24 12:20:42 work1-cpu GCEMetadataScripts[506]: 2021/08/24 12:20:42 GCEMetadataScripts: Finished running startup scripts.
    Aug 24 12:20:42 work1-cpu systemd[1]: google-startup-scripts.service: Succeeded.
    Aug 24 12:20:42 work1-cpu systemd[1]: Started Google Compute Engine Startup Scripts.

So the script supposedly succeeded (exit status 0).

But my code doesn't seem to be running: neither the Python server, the front end, nor the pachctl mount is launched, and top doesn't show them either.

I know I am not supposed to see the sessions because they are run by root, and that I could fix that through the socket, but I don't care about that for the moment: I just need the code to be launched. Does anyone have a clue about what I am missing?


Solution

  • There were several errors. Thanks to Wojtek_B, whose detailed answer pointed me in the right direction.

    1 - First problem: dependencies

    I had to install all the needed dependencies at the start of the script. In my case:

    1.1 - system:

    sudo apt update 
    sudo apt install -y tmux pachctl nodejs npm python3-setuptools python3.7-dev 
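
Since a startup script can exit with status 0 even when a binary is absent, it may help to check after the install step that the commands the sessions call are really on PATH. The following is a minimal sketch, not part of the original answer: `missing_commands` is a hypothetical helper name, `/tmp/missing_commands.txt` is an assumed log path, and the list of binaries is taken from the script and should be adjusted to your own.

```shell
#!/bin/bash
# Print every command from the argument list that is not found on PATH.
# Returns 0 when everything is present, 1 otherwise.
missing_commands() {
  local cmd status=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      status=1
    fi
  done
  return "$status"
}

# Binaries the tmux sessions rely on (assumed list; adjust to your script).
# The log path is a hypothetical choice.
if ! missing_commands tmux pachctl python3 npx >> /tmp/missing_commands.txt; then
  echo "some dependencies are missing, see /tmp/missing_commands.txt" >&2
fi
```

Writing the names to a file rather than to the terminal matters here because nobody is attached to the startup script when it runs.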
    

    1.2 - python:

    python3 -m pip install {all packages here....}
    

    The list of packages to install was retrieved with pip3 list while logged in on the machine. Note the python3 -m pip instead of simply pip or pip3. This matters when Python 2.x is also present on the machine: gcloud uses 2.x by default, so a plain pip install does not work (and neither did pip3 install). In any case, python3 -m pip install ... works, so I would advise using it.
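
The idea of tying pip to one explicit interpreter can be sketched as a small wrapper. This is only an illustration under assumptions: `run_pip` is a hypothetical helper name, and the requirements file mentioned in the comments is a hypothetical path, not something the original answer uses.

```shell
#!/bin/bash
# Run pip through an explicit interpreter so packages are installed for that
# interpreter, regardless of what "pip" or "pip3" resolve to on PATH.
run_pip() {
  local interpreter="$1"
  shift
  "$interpreter" -m pip "$@"
}

# Show which interpreter pip is bound to: the output names the pip version,
# its install location and the Python version it belongs to.
run_pip python3 --version || echo "pip is not available for python3" >&2

# Possible usage (requirements.txt is a hypothetical file):
#   run_pip python3 freeze > requirements.txt      # on the working machine
#   run_pip python3 install -r requirements.txt    # in the startup script
```

Keeping the package list in a file produced by `pip freeze` would also avoid pasting several hundred names into the startup script, at the cost of having to ship that file to the instance.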

    1.3 - node:

    npm install -g npx
    

    2 - Tmux:

    tmux start \; \
      new -d -s sleep 'sleep 1'\;  \
      new -d -s data_vis_pfs 'export KUBECONFIG=/var/data_vis/.kub/config && gcloud auth activate-service-account pfsmounter@{PROJECT}.iam.gserviceaccount.com --key-file=/var/data_vis/sa_cred.json &>> /tmp/pfs_log.txt && gcloud container clusters get-credentials {CLUSTER_NAME} --zone={ZONE_NAME} &>> /tmp/pfs_log.txt &&  kubectl config current-context &>> /tmp/pfs_log.txt && pachctl list repo && pachctl mount /var/data_vis/pfs --verbose &>> /tmp/pfs_log.txt' \; \
      new -d -s data_vis_server 'sleep 1 && ls /var/data_vis/pfs/ &>> /tmp/debug1.txt && cd /var/data_vis/server/ && python3 ./index.py &>> /tmp/server_log.txt' \; \
      new -d -s data_vis_client 'cd /var/data_vis/client/ && npx serve -l 3001 -s build &>> /tmp/client_log.txt'
    
    • first session (sleep): not useful in my case, but it seems to be good practice so that the script does not exit too early
    • second session (pachyderm):
      • I had to create a service account (in the Cloud Console, go to the service accounts page, or use the gcloud iam service-accounts commands if you don't trust the link) with the following roles (sorry if the labels are not exact, I had to translate them from my language):
        • Reader of cluster Kubernetes Engine
        • Service agent of Kubernetes Engine
        • Reader Kubernetes Engine
      • note the {CLUSTER_NAME} and {ZONE_NAME} (find them with gcloud container clusters list) and {PROJECT} placeholders, which you must replace with your own values. I had to manually run export KUBECONFIG=/var/data_vis/.kub/config, otherwise it would fail in the tmux session (although it worked in the main script)
    • third session (Flask server, Python): nothing special, I added a sleep just in case
    • fourth session (front-end application): nothing special
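
The `sleep 1` in the third session is a fixed guess at how long the mount takes. A more defensive variant polls until the mount directory has content. This is a minimal sketch under assumptions: `wait_for_entries` is a hypothetical helper name, and it treats a non-empty directory as "mounted", which would be wrong if the Pachyderm repo list is legitimately empty.

```shell
#!/bin/bash
# Wait until a directory contains at least one entry, checking once per second.
# $1: directory to watch, $2: maximum number of checks (default 30).
# Returns 0 as soon as an entry appears, 1 if the limit is reached.
wait_for_entries() {
  local dir="$1" tries="${2:-30}" i=0
  while [ "$i" -lt "$tries" ]; do
    if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Possible usage before starting the server:
#   wait_for_entries /var/data_vis/pfs 60 && cd /var/data_vis/server/ && python3 ./index.py
```

Because the tmux sessions receive their command as a single quoted string, a function like this would have to live in a small script file that the session runs, rather than in the startup script itself.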

    Final code:

    sudo apt update 
    sudo apt install -y tmux pachctl nodejs npm python3-setuptools python3.7-dev 
    
    python3 -m pip install adal aiohttp ansiwrap anyio appdirs argcomplete argon2-cffi arrow asn1crypto async-generator async-timeout attrs backcall backports.functools-lru-cache bidict binaryornot black bleach blinker blis bokeh boto boto3 botocore brotlipy bz2file cachetools catalogue certifi cffi chardet charset-normalizer click cloudpickle colorama colorcet confuse cookiecutter cryptography cycler cymem dask datashader datashape debugpy decorator defusedxml distributed docker docker-pycreds entrypoints fastai fastcore fastprogress Flask Flask-Cors Flask-SocketIO fsspec gcsfs gitdb GitPython google-api-core google-api-python-client google-auth google-auth-httplib2 google-auth-oauthlib google-cloud-bigquery google-cloud-bigquery-storage google-cloud-bigtable google-cloud-core google-cloud-dataproc google-cloud-datastore google-cloud-firestore google-cloud-kms google-cloud-language google-cloud-logging google-cloud-monitoring google-cloud-pubsub google-cloud-scheduler google-cloud-spanner google-cloud-speech google-cloud-storage google-cloud-tasks google-cloud-translate google-cloud-videointelligence google-cloud-vision google-crc32c google-resumable-media googleapis-common-protos grpc-google-iam-v1 grpcio grpcio-gcp h11 HeapDict holoviews htmlmin httplib2 idna ImageHash imageio importlib-metadata ipykernel ipython ipython-genutils ipython-sql ipywidgets itsdangerous jedi Jinja2 jinja2-time jmespath joblib json5 jsonschema jupyter-client jupyter-core jupyter-http-over-ws jupyterlab jupyterlab-git jupyterlab-pygments jupyterlab-server jupyterlab-widgets kiwisolver kubernetes llvmlite locket loguru Markdown MarkupSafe matplotlib matplotlib-inline missingno mistune msgpack multidict multimethod multipledispatch murmurhash mypy-extensions nbclient nbconvert nbdime nbformat nest-asyncio networkx numba numpy oauthlib olefile packaging pandas pandas-profiling pandocfilters panel papermill param parso partd pathspec pathy patsy pexpect phik pickleshare Pillow pip poyo \
      preshed prettytable prometheus-client prompt-toolkit protobuf psutil ptyprocess pyarrow pyasn1 pyasn1-modules pycosat pycparser pyct pydantic Pygments PyJWT pynndescent pyOpenSSL pyparsing pyrsistent PySocks python-dateutil python-engineio python-pachyderm python-slugify python-socketio pytz pyviz-comms PyWavelets PyYAML pyzmq rawpy regex requests requests-oauthlib requests-unixsocket retrying rope rsa ruamel.yaml ruamel.yaml.clib s3transfer scikit-image scikit-learn scipy seaborn Send2Trash setuptools shellingham simple-websocket simplejson six smart-open smmap sniffio sortedcontainers spacy spacy-legacy SQLAlchemy sqlparse srsly statsmodels tangled-up-in-unicode tblib tenacity terminado testpath text-unidecode textwrap3 thinc threadpoolctl tifffile toml tomli toolz torch torchvision tornado tqdm traitlets typed-ast typeguard typer typing-extensions umap-learn umap-learn[plot] Unidecode uritemplate urllib3 visions wasabi wcwidth webencodings websocket-client Werkzeug wheel whichcraft widgetsnbextension wrapt wsproto xarray yarl zict zipp
    
    #pip3 list &>> /tmp/debug1.txt
    npm install -g npx
    #nodejs --version &>> /tmp/debug1.txt
    
    tmux start \; \
      new -d -s sleep 'sleep 1'\;  \
      new -d -s data_vis_pfs 'export KUBECONFIG=/var/data_vis/.kub/config && gcloud auth activate-service-account pfsmounter@{PROJECT}.iam.gserviceaccount.com --key-file=/var/data_vis/sa_cred.json &>> /tmp/pfs_log.txt && gcloud container clusters get-credentials {CLUSTER_NAME} --zone={ZONE_NAME} &>> /tmp/pfs_log.txt &&  kubectl config current-context &>> /tmp/pfs_log.txt && pachctl list repo && pachctl mount /var/data_vis/pfs --verbose &>> /tmp/pfs_log.txt' \; \
      new -d -s data_vis_server 'sleep 1 && ls /var/data_vis/pfs/ &>> /tmp/debug1.txt && cd /var/data_vis/server/ && python3 ./index.py &>> /tmp/server_log.txt' \; \
      new -d -s data_vis_client 'cd /var/data_vis/client/ && npx serve -l 3001 -s build &>> /tmp/client_log.txt'
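
To confirm after boot that the sessions really exist, they can be queried by name with `tmux has-session`. The following is a minimal sketch, not part of the original answer: `check_sessions` is a hypothetical helper name, and the session names are the ones used in the script above.

```shell
#!/bin/bash
# Report, for each session name given, whether tmux knows about it.
# Returns 0 when all sessions exist, 1 if at least one is missing.
check_sessions() {
  local name missing=0
  for name in "$@"; do
    if tmux has-session -t "$name" 2>/dev/null; then
      echo "running: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

check_sessions data_vis_pfs data_vis_server data_vis_client \
  || echo "at least one session is not running" >&2
```

The startup script runs as root, so the sessions live in root's tmux server: the check has to be run as root as well (for example by saving it to a file and running that file with sudo), otherwise every session is reported as missing. A session whose command has already exited also shows up as missing, in which case the log files under /tmp are the next place to look.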