
Count the number of subdirectories and files under special conditions in Python


I have a folder test in which I want to count the number of projects, buildings and txt_files according to the following rules.

The number of projects simply equals the number of first-layer subdirectories. For the number of buildings, a first-layer subdirectory counts as one building if it has no second-layer subdirectories; otherwise, its second-layer subdirectories are counted instead.

├─a
│  ├─a1
│  ├─a2
│  └─a3
│      ├─a3_1.txt
│      ├─a3_2.geojson
│      └─a3_3.txt
├─b
│  ├─b1
│  ├─b2
│  ├─b3
│  └─b4
├─c
│  ├─c1
│  ├─c2
│  └─c3
├─d
└─123.txt

For the example structure above, num_projects is 4, counting the first-layer subfolders a, b, c and d; num_buildings is 11, counting the subdirectories a1, a2, a3, b1, b2, b3, b4, c1, c2, c3 and d; and num_txt is 3.

My solution so far:

import os

path = os.getcwd()

num_projects = 0 
num_buildings = 0 
num_txt = 0 

for subdirs in os.listdir(path):
    num_projects += 1    

for root, dirnames, filenames in os.walk(path):
    for dirname in dirnames:
        num_buildings += 1
    for filename in filenames:
        if filename[-4:] == ".txt":
            num_txt += 1

print("Number of projects is %d, number of buildings is %d, number of txt files is %d." %(num_projects, num_buildings, num_txt))   

Output:

Number of projects is 5, number of buildings is 17, number of txt files is 3.

num_projects and num_buildings are wrong. How can I correct them? Thanks.


Solution

  • os.walk() is a generator: it yields one directory at a time, so it should handle many directories (and subdirectories) without memory becoming a concern.

    It's not elegant, but try this:

    import os
    
    projects = 0
    buildings = 0
    txt_files = 0
    
    path = os.getcwd()
    
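    for root, directories, files in os.walk(path):
        if root == path:
            # Top level: every subfolder here is a project
            projects = len(directories)
            for sub_dir in directories:
                full_dir = os.path.join(root, sub_dir)
                for root_, directories_, files_ in os.walk(full_dir):
                    if root_ == full_dir:
                        if not directories_:
                            # No subfolders: the project itself is one building
                            buildings += 1
                        else:
                            # Otherwise each second-layer subfolder is a building
                            buildings += len(directories_)
    
        # Count .txt files at every level of the tree
        for i in files:
            if i.endswith('.txt'):
                txt_files += 1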
    
    print("There are {} projects, {} buildings and {} text files".format(projects, buildings, txt_files))
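
The same counting rule can also be written without the nested os.walk(), by listing only the first two layers directly. The following is a minimal sketch rather than a definitive implementation: the names count_tree and build_example are hypothetical helpers introduced here for illustration, and the folder layout is recreated in a temporary directory on the assumption that it matches the tree shown in the question.

```python
import os
import tempfile
from pathlib import Path


def count_tree(path):
    """Return (projects, buildings, txt_files) for the folder at `path`."""
    root = Path(path)
    # Projects: directories directly under the root (plain files are skipped).
    project_dirs = [p for p in root.iterdir() if p.is_dir()]

    buildings = 0
    for project in project_dirs:
        sub_dirs = [p for p in project.iterdir() if p.is_dir()]
        # A project without subfolders counts as a single building itself.
        buildings += len(sub_dirs) if sub_dirs else 1

    # Text files: every name ending in .txt at any depth, root included.
    txt_files = sum(
        1
        for _, _, files in os.walk(root)
        for name in files
        if name.endswith(".txt")
    )
    return len(project_dirs), buildings, txt_files


def build_example(root):
    """Recreate the folder layout from the question under `root` (assumed layout)."""
    layout = {
        "a": ["a1", "a2", "a3"],
        "b": ["b1", "b2", "b3", "b4"],
        "c": ["c1", "c2", "c3"],
        "d": [],
    }
    for project, sub_dirs in layout.items():
        (root / project).mkdir()
        for sub_dir in sub_dirs:
            (root / project / sub_dir).mkdir()
    for name in ("a3_1.txt", "a3_2.geojson", "a3_3.txt"):
        (root / "a" / "a3" / name).touch()
    (root / "123.txt").touch()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_example(Path(tmp))
        print(count_tree(tmp))  # prints (4, 11, 3)
```

Listing each project folder once with iterdir() avoids starting a second full walk per project; os.walk() is still used for the .txt count because that count has to cover every depth.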