python pytorch pycharm conda virtualenv

Screen freezes when training a deep learning model from the terminal but not from PyCharm


I have a strange issue: when I run PyTorch model training from PyCharm, it works fine, but when I run the same code in the same environment from the terminal, the screen freezes and all windows become non-interactive. The freeze affects only my session, not other users, and when they run top it shows that the model is no longer training. The issue is consistent and reproducible across machines, users, and GPU slots.

All dependencies are installed in a conda environment, dl_segm_auto. In PyCharm I have it selected as the interpreter. Parameters are passed through Run -> Edit Configurations.


From the terminal, I run:

conda activate dl_segm_auto
python training.py [parameters]

After the first epoch the entire remote session freezes.

Suggestions greatly appreciated!


Solution

  • The issue was caused by matplotlib's backend taking over the screen on Linux. Any of the following solves the problem:

    1. Installing PyQt5 (which changes the environment's default matplotlib backend).
    2. Running from PyCharm (which selects a backend on startup).
    3. Calling matplotlib.use('Qt5Agg') (other backends may work as well) at the start of the plotting functions or of the top-level script.
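
As a rough illustration of option 3, the sketch below picks a backend explicitly before pyplot is imported and then writes a plot to disk. The helper names pick_backend and save_loss_plot, the loss values, and the output path are all made up for the example. Falling back to the non-interactive "Agg" backend when PyQt5 or a display is missing is an assumption on my part, not something the answer above states; it only suits runs that save figures instead of showing them.

```python
import importlib.util
import os
import tempfile

import matplotlib


def pick_backend():
    """Return "Qt5Agg" when PyQt5 and a display are available, else "Agg".

    Hypothetical helper. The "Agg" fallback is an assumption for runs that
    only save figures to disk and never open a window.
    """
    has_qt = importlib.util.find_spec("PyQt5") is not None
    has_display = bool(
        os.environ.get("DISPLAY") or os.environ.get("WAYLAND_DISPLAY")
    )
    return "Qt5Agg" if (has_qt and has_display) else "Agg"


# Select the backend before pyplot is imported anywhere in the program,
# so matplotlib does not fall back to its automatically chosen default.
matplotlib.use(pick_backend())

import matplotlib.pyplot as plt  # noqa: E402


def save_loss_plot(losses, path):
    """Plot per-epoch losses and write the figure to a file."""
    fig, ax = plt.subplots()
    ax.plot(range(1, len(losses) + 1), losses)
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    fig.savefig(path)
    plt.close(fig)  # free the figure so figures do not pile up across epochs


# Illustrative values only; a real script would pass its own losses.
out_path = os.path.join(tempfile.gettempdir(), "loss_curve.png")
save_loss_plot([0.9, 0.6, 0.4], out_path)
print("backend:", matplotlib.get_backend(), "saved:", out_path)
```

Printing matplotlib.get_backend() once from the terminal and once from PyCharm is also a quick way to check whether the two runs really use different backends.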