
How to extract the stems out of multiple file paths using pathlib?


I am trying to extract the stem out of multiple file paths using pathlib and failing to do so.

Here is the code I tried:

base_path = Path(__file__).parent
paths = (base_path / "../dictionary/files/").glob('**/*')
files = [x for x in paths if x.is_file()]
for i in range(len(files)):
     stem_name = files.stem[i]

Here is the error:

for i in range(len(files)):
TypeError: object of type 'generator' has no len()

I have text files named 1.txt, 2.txt, and 3.txt.

Expected:

1
2
3

Solution

  • You were close.

    You should index files (the list) rather than the stem: each element of the list (files[i]) is a <class 'pathlib.PosixPath'> instance, and .stem is an attribute of that instance, not of the list.

    for i in range(len(files)):
        stem_name = files[i].stem
    
    (test-py38) gino:Q$ cat test.py
    from pathlib import Path
    
    base_path = Path(__file__).parent
    paths = (base_path / "./files").glob('**/*')
    files = [x for x in paths if x.is_file()]
    for i in range(len(files)):
        stem_name = files[i].stem
        print(stem_name)
    
    (test-py38) gino:Q$ ls files
    1.txt  2.txt  3.txt
    
    (test-py38) gino:Q$ python test.py
    2
    3
    1
    

    I'm not sure about this error though, because it is not reproducible from the posted code:

    for i in range(len(files)):
    TypeError: object of type 'generator' has no len()
    

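    This is only reproducible if files is a generator, for example if you used a generator expression (files = (...)) instead of a list comprehension (files = [...]), or if you called len() on paths directly (in the Python 3.8 used here, glob() returns a generator). A map object fails in the same way, although the message then names 'map' rather than 'generator'. In each case len() fails because a lazy iterator does not know its length until it has been consumed.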
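
As a minimal sketch of that difference, the snippet below builds a throwaway directory (a hypothetical stand-in for the files folder, created with tempfile purely for illustration), shows that len() fails on the generator expression, and shows that turning it into a list makes len() work. The names lazy_files, len_error, and file_count are assumptions of this example, not part of the original code.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch directory standing in for the "files" folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("1.txt", "2.txt", "3.txt"):
        (Path(tmp) / name).touch()

    # Generator expression: lazy, so it has no length.
    lazy_files = (p for p in Path(tmp).glob("**/*") if p.is_file())
    try:
        len(lazy_files)
        len_error = None
    except TypeError as exc:
        len_error = str(exc)

    # Materializing the generator into a list makes len() and indexing work.
    files = list(lazy_files)
    file_count = len(files)

print(len_error)
print(file_count)
```

Note that list() consumes the generator, so it can only be materialized once.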

    (test-py38) gino:Q$ cat test.py
    from pathlib import Path
    
    base_path = Path(__file__).parent
    paths = (base_path / "./files").glob('**/*')
    files = (x for x in paths if x.is_file())  # <---- generator expression
    for i in range(len(files)):
        stem_name = files[i].stem
        print(stem_name)
    
    (test-py38) gino:Q$ python test.py
    Traceback (most recent call last):
      File "test.py", line 6, in <module>
        for i in range(len(files)):
    TypeError: object of type 'generator' has no len()
    

    If you need to loop through a generator, iterate over it directly instead of indexing it. A generator supports neither len(files) nor files[i].

    files = (x for x in paths if x.is_file())
    for a_file in files:
        stem_name = a_file.stem
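
If the goal is simply the list of stems in a predictable order, one possible approach is to collect them in a single expression and sort the result, since glob() does not guarantee any ordering (the run above printed 2, 3, 1). This is only a sketch: sorted_stems is a hypothetical helper name, and the temporary directory is created here just so the example runs on its own.

```python
import tempfile
from pathlib import Path


def sorted_stems(folder):
    """Return the stems of all files under folder, sorted as strings."""
    return sorted(p.stem for p in Path(folder).glob("**/*") if p.is_file())


# Hypothetical scratch directory standing in for the "files" folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("2.txt", "3.txt", "1.txt"):
        (Path(tmp) / name).touch()
    stems = sorted_stems(tmp)

print(stems)  # ['1', '2', '3']
```

The sort here is lexicographic, so a stem such as "10" would come before "2"; passing key=int to sorted() is one way to get numeric order when every stem is a number.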