Tags: python, operating-system, runtime-error

Frozen OS error when walking the file tree of an external drive


I want to normalize file paths (removing accents) on an external drive, and I use os.walk(). At some point the script freezes, and after I cancel it I see this message:

^CTraceback (most recent call last):
  File "~/normalize_filepaths.py", line 2
    for root, dirs, files in os.walk(target, topdown = False):
  File "<frozen os>", line 377, in walk
KeyboardInterrupt

Here is a snippet of the relevant code:

import os
import shutil
import unicodedata


def normalize(fp):
    """
    >>> normalize("/Volumes/MM_BUP/MIGUEL/Acólitos")
    '/Volumes/MM_BUP/MIGUEL/Acolitos'
    >>> normalize("/Volumes/MM_BUP/MIGUEL/Acólitos")
    '/Volumes/MM_BUP/MIGUEL/Acolitos'
    >>> normalize("'This is my cup.' _ ゼロコ ZEROKO _ 紅茶の遊び方 _ mime _ clowning-8MgRJAXn1tE.mp4")
    "'This is my cup.' _  ZEROKO _  _ mime _ clowning-8MgRJAXn1tE.mp4"
    >>> normalize(" großer Tag")
    ' grosser Tag'
    """

    fp = fp.replace("ß", "ss")
    
    name_clean = unicodedata.normalize('NFD', fp)
    return name_clean.encode('ascii', 'ignore').decode("ascii")


def main(target="/some/path"):
    for root, dirs, files in os.walk(target, topdown = False):
        for name in files + dirs:
            filepath = os.path.join(root, name)
            clean = normalize(name)
            new_filepath = os.path.join(root, clean)
            shutil.move(filepath, new_filepath)

How can I avoid this frozen OS error and visit all files and directories?


Solution

  • In order to reproduce the problem, I tried:

    1. File names that normalize would modify (such as the ones in that function's docstring). ASCII-only names were tried too.
    2. Symbolic links, along with normal files (empty and non-empty).

    ...and it worked fine every time (on a toy file system hierarchy). So I failed to reproduce the problem, which means my test setup did not match your situation.

    Your stack trace, though, does not necessarily indicate an error in the code (at least as far as I can see). It shows that you interrupted the program at some point: you got a KeyboardInterrupt, which is raised when you cancel the run yourself, as your post says you did. So it seems the program was not responding. In my experience, when a code segment freezes (or seems to freeze), the first possible causes to investigate are:

    1. Deadlocks. These can happen, for example, when accessing shared resources in an environment of competing consumers. Assuming the simplest scenario, i.e. that your code runs in a single thread and process, the only place I can imagine a deadlock in your snippet is inside the implementations of the functions you call from the imported modules os, shutil, and unicodedata. Since these are Python standard library modules, I am fairly confident that they have no race conditions. Even if they did, I don't know how they are implemented internally and could not investigate them in a timely manner (a high-effort attempt with a low chance of success), so I left this scenario to be investigated last, in favor of the simpler ones below.
    2. Infinite loops. Here it occurred to me that moving a file with shutil.move from its original name to a new name inside the same root path could make os.walk produce a new entry, which would then be moved again, and so on. I tried this with a file name that purposely alternates, but failed to produce an infinite loop. I then realized that this shouldn't be the case in your code, for two reasons. Firstly, because of how you generate the new names: normalize(normalize(filepath)) should be the same as normalize(filepath), according to some tests and the documentation I read, so no new path would be generated on a second pass. Secondly, according to the documentation of os.walk, the yielded values are of types str and list, which indicates that they are filled beforehand and not while being iterated.
    3. Actual lengthy processing. The first thing I would blame here is that the code performs read and write operations on secondary storage (i.e. your external drive). Both os.walk and shutil.move are possible candidates then. Based on past observations, I assumed the former can easily be slow when run on directories with many files. The latter can be slow for large files when the data actually has to be copied (a rename within the same file system usually is not). But since you are getting the KeyboardInterrupt inside os.walk and not inside shutil.move, many small files would explain both a lengthy os.walk and short shutil.move calls, so I assumed the first case (os.walk over many files) first.
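
    The idempotence argument in point 2 can be checked directly. This is a minimal sketch, assuming normalize is exactly the function from the question (copied here so the block runs on its own); the sample names are made up for illustration:

```python
import unicodedata


def normalize(fp):
    # Same logic as the question's normalize(), repeated so this sketch is self-contained.
    fp = fp.replace("ß", "ss")
    name_clean = unicodedata.normalize("NFD", fp)
    return name_clean.encode("ascii", "ignore").decode("ascii")


# Hypothetical sample names, chosen only for illustration.
samples = ["Acólitos", " großer Tag", "ゼロコ ZEROKO", "plain_ascii.txt"]

for name in samples:
    once = normalize(name)
    twice = normalize(once)
    # A second pass must not change the name, otherwise renames could cascade.
    assert once == twice, (name, once, twice)
    print(repr(name), "->", repr(once))
```

    The output of normalize is pure ASCII without "ß", so a second pass has nothing left to strip or replace, which is why a renamed entry cannot trigger yet another rename.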

    To make progress towards ruling out the deadlock case, verifying the infinite loop case, or ruling out the case of os.walk over many files, the simplest way is to add some print statements inside the inner loop. Of course, simple print statements cannot detect the case of shutil.move on large files as easily (unless you can add them inside shutil.move itself), but I would suggest starting with the easy stuff. As a bonus, the print statements let you verify that the generated file paths are legal and in the intended format, which helps rule out the possibility that the problem comes from normalizing an unusual path.
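
    The print-statement approach could look roughly like the sketch below. It is a dry run that renames nothing, so it only measures how long the traversal itself takes. The names walk_with_progress, report_every and report_error are hypothetical, not part of the question's code:

```python
import os
import time


def walk_with_progress(target, report_every=1000):
    """Walk target bottom-up and print progress; nothing is renamed."""
    start = time.monotonic()
    seen = 0

    def report_error(err):
        # os.walk ignores directory-listing errors unless an onerror callback is given.
        print(f"walk error: {err}", flush=True)

    for root, dirs, files in os.walk(target, topdown=False, onerror=report_error):
        for name in files + dirs:
            seen += 1
            if seen % report_every == 0:
                elapsed = time.monotonic() - start
                print(f"{seen} entries after {elapsed:.1f}s, now in {root!r}", flush=True)

    print(f"done: {seen} entries in {time.monotonic() - start:.1f}s", flush=True)
    return seen
```

    If the counter keeps growing, the script is slow rather than stuck. Note that with topdown=False nothing is yielded until os.walk has descended to the deepest directories, so the first progress line may take a while on a large drive.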

    To be honest, I didn't test with a large number of files to stress the above reasoning. But since you confirmed in the comments that this is indeed the cause and suggested posting this as an answer, I did so. I am certain there are more scenarios that can make code freeze, but I am glad that the actual one was found and that I contributed to you finding it (hehe).