python windows escaping glob backslash

Why is glob ignoring some directories?


I'm trying to find all *.txt files in a directory with glob(). In some cases, glob.glob('some\path\*.txt') returns an empty list even though matching files exist in the given directory. This seems to happen mostly when the path is all lower-case or numeric. As a minimal example, I have two folders, a and A, on my C: drive, each holding one Test.txt file.

import glob
files1 = glob.glob('C:\a\*.txt')
files2 = glob.glob('C:\A\*.txt')

yields

files1 = []
files2 = ['C:\\A\\Test.txt']

If this is by design, are there other directory names that lead to such unexpected behaviour?

(I'm working on Windows 7 with Python 2.7.10, 32-bit.)

EDIT: (2019) Added an answer for Python 3 using pathlib.


Solution

  • The problem is that \a is an escape sequence in string literals: it stands for the bell character (ASCII 7), so 'C:\a\*.txt' never contains the directory name a at all.

    Just double the backslashes when writing paths in string literals (i.e. use "C:\\a\\*.txt").

    Python differs from C here: when a backslash is followed by a character that has no special meaning (e.g. "\s"), Python keeps both the backslash and the letter, whereas in C you would typically get just the "s".

    This sometimes hides the issue, because a single backslash happens to work whenever the first letter of the directory name does not form an escape sequence ...
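
The effect described above can be checked without touching the file system. The following is a minimal sketch, written for Python 3. The helper name is_risky_first_letter and the set ESCAPE_STARTERS are made up for this illustration and are not part of glob or any library. The set lists the characters that start an escape sequence in a Python 3 string literal; in Python 2.7 byte-string literals, N, u and U are not treated as escapes. The digits explain why numeric directory names are affected too: \0 through \7 begin octal escapes. The raw-string and forward-slash spellings are alternatives to doubling that the answer does not mention; forward slashes are assumed to be acceptable because Windows generally accepts them in paths.

```python
# '\a' in a normal string literal is a single BEL character (0x07),
# so this literal does not contain a backslash followed by 'a'.
broken = 'C:\a'
print(len(broken))    # 3 characters: 'C', ':' and BEL
print(repr(broken))   # 'C:\x07'

# Three spellings that keep the directory name intact.
doubled = 'C:\\a\\*.txt'   # every backslash escaped, as the answer suggests
raw = r'C:\a\*.txt'        # raw string: backslashes are taken literally
forward = 'C:/a/*.txt'     # forward slashes (assumed acceptable on Windows)

print(doubled == raw)      # True

# Hypothetical helper: characters that begin an escape sequence when they
# directly follow a backslash in a Python 3 (non-raw) string literal.
ESCAPE_STARTERS = set('abfnrtvxNuU01234567')

def is_risky_first_letter(name):
    """Return True if writing '\\' + name in a plain literal would be misread."""
    return name[:1] in ESCAPE_STARTERS

print(is_risky_first_letter('a'))   # True
print(is_risky_first_letter('A'))   # False
```

This matches the behaviour in the question: a is in the set and A is not, so only the lower-case folder name gets swallowed by the escape sequence.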