I have an extremely weird issue where if I run pytorch model training from Pycharm, it works fine but when I run the same code on the same environment from terminal, it freezes the screen. All windows become non-interactable. The freeze affects only me, not other users and for them >>top shows that the model is no longer training. The issue is consistent and reproducible across machines, users, and GPU slots.
All dependencies are installed to a conda environment dl_segm_auto. In pycharm I have it selected as the interpreter. Parameters are passed through Run->Edit configuration.
From terminal, I run
conda activate dl_segm_auto
python training.py [parameters]
After the first epoch the entire remote session freezes.
Suggestions greatly appreciated!
The issue was caused by the matplotlib's backend taking over the screen on Linux. Any of the following can solve the problem: