Tags: python, list, extend, with-statement, listdir

Why are both filenames in the directory printed while only the content of one file is added to the list?


I'm new to Python, and I'm sure the mistake is obvious to most of you. I am trying to iterate through a folder using os.listdir(). Only files whose names end in .out matter. I want to extend the list out = [] with every entry of every *.out file. To check whether my if condition works, I print the filenames (two filenames are printed), but only the content of one file ends up in the list out.

out = []

for filename in os.listdir(path):
    if filename.endswith('.out'):
        print(filename)
        with open(filename) as f:
            out.extend(f)

Solution

  • As I said in one of my comments, if you are on Python 3.4+, pathlib will make your life a lot easier (note that Path.read_text(), used further down, needs Python 3.5+).

    To get a list of all file names ending in .out in a folder named folder, you can simply do:

    from pathlib import Path
    
    folder = Path('folder')
    
    outs = [_.name for _ in folder.glob('*.out')]
    

    And that is it.

    If you want to populate a list called lines with the contents of all *.out files, you can simply do:

    from pathlib import Path
    
    folder = Path('folder')
    
    lines = []
    
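    # Extend with each file's own entries so that lines stays flat;
    # extending with a list of per-file lists would nest one list per file.
    # Note: split() breaks on any whitespace, not only on newlines.
    for out_file in folder.glob('*.out'):
        lines.extend(out_file.read_text().split())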
    

    And here is a small proof of concept:

    $ tree temp
    temp
    ├── file1.out
    ├── file2.out
    ├── file3.txt
    └── file4.txt
    
    0 directories, 4 files
    $ 
    
    
    Python 3.7.5 (default, Dec 15 2019, 17:54:26) 
    [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from pathlib import Path
    >>> folder = Path('temp')
    >>> outs = [_.name for _ in folder.glob('*.out')]
    >>> txts = [_.name for _ in folder.glob('*.txt')]
    >>> outs
    ['file1.out', 'file2.out']
    >>> txts
    ['file3.txt', 'file4.txt']
    >>> 
    

    Here is another way to concatenate the contents:

    $ cat temp/file1.out 
    1
    2
    3
    4
    $ cat temp/file2.out 
    5
    6
    7
    8
    $ 
    
    >>> lines = [l for _ in folder.glob('*.out') for l in _.read_text().split()]
    >>> lines
    ['1', '2', '3', '4', '5', '6', '7', '8']
    >>> 
    

    I hope it helps.
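
If you would rather keep the os.listdir() loop from the question, here is a minimal sketch of how it could look. It rests on an assumption the question does not confirm: that the files are opened by bare name while the script runs from a different directory than path. os.listdir() returns names only, so each name is joined with the folder before opening. The function name collect_out_lines and the temporary test folder are made up for illustration.

```python
import os
import tempfile


def collect_out_lines(path):
    """Return every line of every *.out file directly inside `path`.

    Hypothetical helper, not part of the original question or answer.
    """
    out = []
    # sorted() makes the order deterministic; os.listdir() order is arbitrary.
    for filename in sorted(os.listdir(path)):
        if filename.endswith('.out'):
            # os.listdir() yields bare names, so join them with the folder;
            # otherwise open() looks in the current working directory.
            with open(os.path.join(path, filename)) as f:
                # Iterating a file yields lines with their trailing newline,
                # so strip it to get one clean entry per line.
                out.extend(line.rstrip('\n') for line in f)
    return out


if __name__ == '__main__':
    # Build a throwaway folder with two .out files and one .txt file.
    with tempfile.TemporaryDirectory() as tmp:
        samples = [('file1.out', '1\n2\n'), ('file2.out', '3\n4\n'), ('file3.txt', 'x\n')]
        for name, text in samples:
            with open(os.path.join(tmp, name), 'w') as f:
                f.write(text)
        print(collect_out_lines(tmp))  # ['1', '2', '3', '4']
```

Unlike read_text().split(), this keeps one list entry per line even when a line contains spaces.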