python, jupyter, tensorboard

TensorBoard instances listed as running while the actual processes are defunct


In a Jupyter notebook, notebook.list() shows:

Known TensorBoard instances:

  • port 6006: logdir /home/ai-mining/AI_Mining/logs/train (started 21:49:59 ago; pid 32470)
  • port 6006: logdir /home/ai-mining/AI_Mining/logs/ (started 1:20:19 ago; pid 34361)

and if I run !kill 32470 and !kill 34361, I get:

/bin/sh: 1: kill: No such process

/bin/sh: 1: kill: No such process

Indeed, if I list the TensorBoard processes in a terminal with ps -ax | grep tensorboard:

3788 pts/9 S+ 0:00 grep --color=auto tensorboard

there are no such processes to kill. I should mention that the log directory (log_dir) where the data is stored is empty. Also, the only option in this case is to reload the extension with %reload_ext tensorboard rather than load it, and that does not work because the actual process is dead.

How do I clear the entries listed by notebook.list(), or otherwise solve my problem? I cannot connect to TensorBoard now. Thanks in advance.


Solution

  • Here is what solved my problem in the end; if there is a nicer way to solve this, I would be interested. TensorBoard keeps track of the process IDs (even defunct ones) in the directory /tmp/.tensorboard-info. If this folder is not under /tmp, the temporary location can be found with tempfile.gettempdir(). The following prints the path, removes the folder, and frees the port:

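    import os
    import shutil
    import tempfile

    # Locate the .tensorboard-info folder inside the system temp directory
    path = os.path.join(tempfile.gettempdir(), ".tensorboard-info")
    print(path)
    # Remove the folder recursively; ignore_errors avoids a failure if it is already gone
    shutil.rmtree(path, ignore_errors=True)
    # Kill whatever still holds port 6006 (Jupyter shell escape)
    !fuser 6006/tcp -k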
    

    I removed the .tensorboard-info folder, cleared the logs-dir folder, and restarted TensorBoard.
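If deleting the whole folder feels too blunt, a narrower cleanup is possible. The following is only a minimal sketch, not TensorBoard's own API: it assumes the files inside .tensorboard-info are named pid-<pid>.info (check your own folder first), and it works on POSIX systems only. The names pid_is_alive and remove_stale_info_files are hypothetical helpers introduced for illustration.

```python
import os
import re
import tempfile


def pid_is_alive(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_info_files(info_dir=None, is_alive=pid_is_alive):
    """Delete info files whose pid is no longer running; return their names.

    Assumption: files are named "pid-<pid>.info". Anything else is left alone.
    """
    if info_dir is None:
        info_dir = os.path.join(tempfile.gettempdir(), ".tensorboard-info")
    removed = []
    if not os.path.isdir(info_dir):
        return removed
    for name in sorted(os.listdir(info_dir)):
        match = re.fullmatch(r"pid-(\d+)\.info", name)
        if match and not is_alive(int(match.group(1))):
            os.remove(os.path.join(info_dir, name))
            removed.append(name)
    return removed
```

Calling remove_stale_info_files() with no arguments targets the default temp location and leaves entries for live processes untouched, so a TensorBoard instance that is still running stays listed.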

    To avoid repeating this cleanup, save each run's log files in a separate subfolder of the logs-dir folder, keep only one TensorBoard instance running, and reload it when needed. Create the new folders using:

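    from datetime import datetime
    from tensorflow.keras.callbacks import TensorBoard  # assumed: the Keras TensorBoard callback
    # One timestamped subfolder per run
    TensorBoard(histogram_freq=1, log_dir='/home/my_project_directory/logs/' + datetime.now().strftime("%Y%m%d-%H%M%S"))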
    

    Each run will then be listed separately in TensorBoard and can be visualized.
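The timestamped folder name can also be built in a small helper, which keeps the path logic in one place and makes it easy to check. This is a sketch only; make_run_logdir is a hypothetical name, and the base directory is whatever your project uses.

```python
import os
from datetime import datetime


def make_run_logdir(base_dir, now=None):
    """Return a per-run log path of the form <base_dir>/YYYYMMDD-HHMMSS."""
    if now is None:
        now = datetime.now()  # default: timestamp of the current run
    return os.path.join(base_dir, now.strftime("%Y%m%d-%H%M%S"))
```

The result can be passed as log_dir to the TensorBoard callback, for example log_dir=make_run_logdir('/home/my_project_directory/logs').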