I am trying to list all text files in my directory whose contents match any of several regular expressions.
Here is a sample regular expression that searches for a phone number in a file:
# Search for phone numbers
regex2 = r'\d\d\d[-]\d\d\d[-]\d\d\d\d'
Here is my code to get all the files, but I am confused about where to put the regex:
import glob
# Raw strings keep the backslashes in Windows paths literal
folder_path = r"C:\Temp"
file_pattern = r"\*.txt"
search_string = "hello"
match_list = []
folder_contents = glob.glob(folder_path + file_pattern)
for file in folder_contents:
    print("Checking", file)
    read_file = open(file, 'rt').read()
    if search_string in read_file:
        match_list.append(file)
print("Files containing search string")
for file in match_list:
    print(file)
Here is another approach that reads all the .txt files in my directory line by line:
import glob
import errno
path = '/home//*.txt'  # note C:
files = glob.glob(path)
for name in files:
    with open(name) as f:
        for line in f:
            split = line.split()
            if split:
                print(split)
I tried putting my regex in the if statement in each of the above, but it gives me errors. Any ideas?
import re
# Define your regex
regex2 = re.compile(r'\d\d\d[-]\d\d\d[-]\d\d\d\d')
# Read files...
# Check if we have matches in the file content
matches = regex2.findall(read_file)
if matches:
    match_list.append(file)
    print('file:', file)
    print('matches:', matches)
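Since the question asks about several regular expressions, here is a minimal sketch that puts the pieces together in one place. It is only one possible arrangement, not the definitive one: the function name `find_matching_files`, the `PATTERNS` list, and the second (email-like) pattern are made up for illustration, and only the phone-number regex comes from the question. It also assumes the files can be read as text, using `errors='replace'` so that a stray byte does not stop the scan.

```python
import glob
import os
import re

# Hypothetical pattern list; only the phone-number regex is from the question.
PATTERNS = [
    re.compile(r'\d\d\d[-]\d\d\d[-]\d\d\d\d'),  # phone number such as 555-123-4567
    re.compile(r'[\w.+-]+@[\w-]+\.[\w.]+'),     # assumed example: a rough email pattern
]


def find_matching_files(folder_path, patterns=PATTERNS):
    """Return {file path: list of matches} for .txt files matching any pattern."""
    results = {}
    for path in glob.glob(os.path.join(folder_path, '*.txt')):
        # errors='replace' is an assumption so odd encodings do not abort the scan
        with open(path, 'rt', errors='replace') as f:
            text = f.read()
        # Collect the matches of every pattern against the whole file content
        matches = [m for pattern in patterns for m in pattern.findall(text)]
        if matches:
            results[path] = matches
    return results


if __name__ == '__main__':
    # Folder taken from the question; adjust to your own directory
    for path, matches in find_matching_files(r'C:\Temp').items():
        print('file:', path)
        print('matches:', matches)
```

Compiling the patterns once and calling `findall` on the whole file keeps the loop close to the original code. If the files are large, the same check could be done line by line inside `for line in f:` instead of reading everything at once.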