Tags: python, python-3.x, debugging, pycharm, python-3.5

Debugger times out at "Collecting data..."


I am debugging a Python (3.5) program with PyCharm (PyCharm Community Edition 2016.2.2 ; Build #PC-162.1812.1, built on August 16, 2016 ; JRE: 1.8.0_76-release-b216 x86 ; JVM: OpenJDK Server VM by JetBrains s.r.o) on Windows 10.

The problem: when stopped at some breakpoints, the Debugger window gets stuck at "Collecting data...", which eventually times out with "Unable to display frame variables".

The data to be displayed is neither special nor particularly large. PyCharm can evidently access it, since a conditional breakpoint on some values of that data works fine (the program breaks). It looks like only the process that gathers the data for display (as opposed to operational purposes) fails.

When I step into a function around the place I have my breakpoint, its data is displayed correctly. When I go up the stack (to the calling function, the one I stepped down from and where I wanted initially to have the breakpoint) - I am stuck with the "Collecting data" timeout again.

Numerous issues reporting the same problem have been raised since at least 2005. Some were fixed, some were not. The fix was usually to update to the latest version (which I already have).

Is there a general approach I can take to fix or work around this family of problems?


EDIT: A year later, the problem is still there, and there has still been no reaction from the devs/support since the bug was raised.


EDIT April 2018: It looks like the problem is solved in version 2018.1. The following code, which used to hang when a breakpoint was set on the print line, now works (I can see the variables):

import threading

def worker():
    a = 3
    print('hello')

threading.Thread(target=worker).start()

Solution

  • I think this is caused by some classes having a default __str__() method that is too verbose. PyCharm calls this method to display the local variables when it hits a breakpoint, and it gets stuck while building the string. A trick I use to overcome this is to manually edit the class that causes the error and replace its __str__() method with something less verbose.

    As an example, it happens with the PyTorch _TensorBase class (and all tensor classes extending it), and can be solved by editing the PyTorch source file torch/tensor.py and changing the __str__() method to:

    def __str__(self):
        # All strings are unicode in Python 3, while we have to encode unicode
        # strings in Python2. If we can't, let python decide the best
        # characters to replace unicode characters with.
        #
        # Workaround: return a short constant string instead of rendering the
        # whole tensor (str() is just the empty string). The original body is
        # kept below, commented out.
        return str() + ' Use .numpy() to print'
        #if sys.version_info > (3,):
        #    return _tensor_str._str(self)
        #else:
        #    if hasattr(sys.stdout, 'encoding'):
        #        return _tensor_str._str(self).encode(
        #            sys.stdout.encoding or 'UTF-8', 'replace')
        #    else:
        #        return _tensor_str._str(self).encode('UTF-8', 'replace')
    

    Far from optimal, but it comes in handy.

    UPDATE: The error seems to be solved in the latest PyCharm version (2018.1), at least for the case that was affecting me.
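
For versions where the problem still occurs, the same __str__() replacement can possibly be applied at runtime rather than by editing the library source on disk. The following is only a minimal sketch of that idea: VerboseThing and short_str are hypothetical names made up for illustration (not part of PyTorch or PyCharm), and whether a given library class accepts this kind of attribute assignment is an assumption you would need to check. It also overrides __repr__(), on the assumption that a debugger might call either method.

```python
import reprlib


class VerboseThing:
    """Hypothetical stand-in for a class whose __str__ is expensive."""

    def __init__(self, n):
        self.data = list(range(n))

    def __str__(self):
        # Renders every element, which is slow and huge for large n.
        return ", ".join(str(x) for x in self.data)


def short_str(self):
    # Cheap summary: class name, length, and a truncated view of the data.
    return "<{} len={} data={}>".format(
        type(self).__name__, len(self.data), reprlib.repr(self.data)
    )


# Patch the class at runtime instead of editing its source file.
VerboseThing.__str__ = short_str
VerboseThing.__repr__ = short_str

thing = VerboseThing(1000000)
print(str(thing))  # a short one-line summary rather than a million numbers
```

Putting the two assignments near the top of the script being debugged keeps the library installation untouched, so the workaround disappears as soon as those lines are removed.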