
Python: Glob Original Directories Using Regex


I have a bunch of directories that follow this naming convention:

foo
foo.v2
foo_v01
foo_v02
foo_v03
bar
bar.v3
bar_v01
bar_v02

I am looking for a regex that matches only the original directories (foo and foo_v01; bar and bar_v01). I'm using Path.glob(pattern) from pathlib to list the directories. I want to select the original directories by name, not by timestamp.


Solution

  • This works for your examples (if it fails on other names, please add them to your question):

    r'^(?!\w+_v0[2-9])(\w+)$'
    

    Explanation:

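    (\w+) matches one or more word characters: letters, digits, and underscores. Because a dot is not a word character and the pattern is anchored with ^ and $, names such as foo.v2 and bar.v3 never match.

    (?!\w+_v0[2-9]) is a negative lookahead: if the name consists of one or more word characters followed by _v0 and a digit from 2 to 9, the match is rejected. This covers versions _v02 through _v09 only; a name such as foo_v10 would still match, so the lookahead would need extending if versions can go that high.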
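
Note that Path.glob takes shell-style wildcard patterns, not regular expressions, so the regex has to be applied to the directory names separately. A minimal sketch of one way to do that, assuming the directories sit directly under a single parent folder; the names `ORIGINAL_DIR` and `original_dirs` are made up for this illustration:

```python
import re
from pathlib import Path

# Pattern from the answer above: the whole name must be word characters,
# and the lookahead rejects names containing _v02 through _v09.
ORIGINAL_DIR = re.compile(r'^(?!\w+_v0[2-9])(\w+)$')


def original_dirs(root):
    """Return the sorted names of directories directly under root that match."""
    return sorted(
        p.name
        for p in Path(root).iterdir()  # iterdir lists entries; the regex does the filtering
        if p.is_dir() and ORIGINAL_DIR.match(p.name)
    )
```

Calling `original_dirs(".")` in a folder containing the directories from the question should leave only foo, foo_v01, bar, and bar_v01.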