Tags: python, glob, pathlib

How to iterate through files using Path.glob() when file names have digits of different lengths


My Directory looks like this:

P1_SAMPLE.csv
P2_SAMPLE.csv
P3_SAMPLE.csv
P11_SAMPLE.csv
P12_SAMPLE.csv
P13_SAMPLE.csv

My code looks like this:

from pathlib import Path

file_path = r'C:\Users\HP\Desktop\My Directory'

for fle in Path(file_path).glob('P*_SAMPLE.csv'):
    number = fle.name[1]
    print(number)

This gives output:

1
2
3
1
1
1

How do I make the code output the actual full digits for each file, like this:

1
2
3
11
12
13

Would prefer to use fle.name[] if possible. Many thanks in advance!


Solution

  • Use a regular expression:

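    import re
    from pathlib import Path

    for fle in Path(file_path).glob('P*_SAMPLE.csv'):
        # Capture the run of digits between 'P' and '_SAMPLE.csv'
        m = re.search(r'P(\d+)_SAMPLE\.csv', fle.name)
        if m:  # skip names the glob matched that contain no digits
            print(m.group(1))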
    

    You can even simplify this to:

    m = re.search(r'(\d+)', fle.name)
    

    This works because a number appears in only one place in the filename.
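
The question asks for something based on fle.name[], so here is a minimal slicing sketch. It assumes every name starts with a single 'P', followed by the digits and then an underscore. The helper name sample_number and the list of names are only illustrative and are not part of the original answer.

```python
def sample_number(name: str) -> int:
    """Return the number between the leading 'P' and the first underscore."""
    # name[1:...] skips the 'P'; name.index('_') marks where the digits end.
    return int(name[1:name.index('_')])


# Illustrative names mirroring the directory listing in the question.
names = ['P1_SAMPLE.csv', 'P11_SAMPLE.csv', 'P2_SAMPLE.csv',
         'P13_SAMPLE.csv', 'P3_SAMPLE.csv', 'P12_SAMPLE.csv']

numbers = [sample_number(n) for n in names]

# Sorting on the integer value avoids the lexicographic order
# (P1, P11, P12, P13, P2, P3) that sorting the raw strings would give.
ordered = sorted(names, key=sample_number)

for n in ordered:
    print(sample_number(n))
```

In the original loop this would be used as sample_number(fle.name). Because glob() does not guarantee any particular order, passing key=lambda p: sample_number(p.name) to sorted() is one way to visit the files in numeric order.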