Output all txt files in a directory that contain any of a series of regular expressions (Python)


I am trying to list all text files in my directory that contain a match for any of several regular expressions.

Here is a sample regular expression that searches for a phone number in a file:

# Search for phone numbers
regex2 = r'\d\d\d[-]\d\d\d[-]\d\d\d\d'

Here is my code to get all the files, but I am confused about where to put the regex.

import glob

folder_path = r"C:\Temp"    # raw strings, so backslashes are not read as escapes
file_pattern = r"\*.txt"
search_string = "hello"

match_list = []

folder_contents = glob.glob(folder_path + file_pattern)

for file in folder_contents:
    print("Checking", file)
    # Use a context manager so the file is closed after reading
    with open(file, 'rt') as f:
        read_file = f.read()

    if search_string in read_file:
        match_list.append(file)

print("Files containing search string")
for file in match_list:
    print(file)

Here is another method for reading all txt files in my directory:

import glob
path = '/home//*.txt' #note C:
files = glob.glob(path)
for name in files:
    with open(name) as f:
        for line in f:
            split = line.split()
            if split:
                print(line.split())

I tried putting my regex in the if statement in each of the above, but it gives me errors. Any ideas?


Solution

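import re

# Define your regex (compiled once, reused for every file)
regex2 = re.compile(r'\d\d\d[-]\d\d\d[-]\d\d\d\d')

# Read files... (the glob loop from the question, which sets file and read_file)

# Check if we have matches in the file content
# (this replaces the "if search_string in read_file" test inside the loop)
matches = regex2.findall(read_file)
if matches:
    match_list.append(file)
    print('file:', file)
    print('matches:', matches)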
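
Since the question asks about several regular expressions rather than one, here is a minimal sketch of how the fragment above might be combined with the glob loop from the question. The function name `find_matching_files`, the `PATTERNS` dictionary, and the email regex are assumptions made for illustration. Only the phone pattern comes from the question, written here with the equivalent `{n}` quantifiers.

```python
import glob
import os
import re

# Hypothetical pattern set: the phone regex is the one from the question,
# the email regex is only an illustrative assumption.
PATTERNS = {
    "phone": re.compile(r"\d{3}-\d{3}-\d{4}"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def find_matching_files(folder_path, patterns=PATTERNS):
    """Return {file path: {pattern name: [matches]}} for every .txt file
    in folder_path that matches at least one pattern."""
    results = {}
    for path in sorted(glob.glob(os.path.join(folder_path, "*.txt"))):
        # errors="replace" keeps one badly encoded file from aborting the scan
        with open(path, "rt", encoding="utf-8", errors="replace") as f:
            text = f.read()
        found = {}
        for name, rx in patterns.items():
            hits = rx.findall(text)
            if hits:
                found[name] = hits
        if found:
            results[path] = found
    return results


if __name__ == "__main__":
    # Folder taken from the question; adjust as needed.
    for path, found in find_matching_files(r"C:\Temp").items():
        print("file:", path)
        for name, hits in found.items():
            print("  ", name, "matches:", hits)
```

Keeping the patterns in a dictionary means that adding another regex is a one-line change, and the result records which pattern matched in which file. If you only need to know whether a file matches, `rx.search(text)` would be enough and stops at the first hit.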