In the code below, the program takes a string from the user, converts it to hex and to ASCII codes, and searches all .log and .txt files in a given directory for the string in its plain, hex, and ASCII forms. When the string is found, the program prints the line number, the string type found, and the file path. However, I don't only want output for files where the string is found: I would also like it to print the file path and the search string for files that were searched but contained no match. I'm a newbie, so please don't be frustrated with the simplicity of the problem. I'm still learning. Thanks. Code below:
    elif searchType == '2':
        print "\nDirectory to be searched: " + directory
        print "\nFile result2.log will be created in: c:\Temp_log_files."
        paths = "c:\\Temp_log_files\\result2.log"
        temp = file(paths, "w")
        userstring = raw_input("Enter a string name to search: ")
        userStrHEX = userstring.encode('hex')
        userStrASCII = ''.join(str(ord(char)) for char in userstring)
        regex = re.compile(r"(%s|%s|%s)" % ( re.escape( userstring ), re.escape( userStrHEX ), re.escape( userStrASCII )))
        goby = raw_input("Press Enter to begin search (search ignores whitespace)!\n")
        def walk_dir(directory, extensions=""):
            for path, dirs, files in os.walk(directory):
                for name in files:
                    if name.endswith(extensions):
                        yield os.path.join(path, name)
        whitespace = re.compile(r'\s+')
        for line in fileinput.input(walk_dir(directory, (".log", ".txt"))):
            result = regex.search(whitespace.sub('', line))
            if result:
                template = "\nLine: {0}\nFile: {1}\nString Type: {2}\n\n"
                output = template.format(fileinput.filelineno(), fileinput.filename(), result.group())
                print output
                temp.write(output)
                break
            elif not result:
                template = "\nLine: {0}\nString not found in File: {1}\nString Type: {2}\n\n"
                output = template.format(fileinput.filelineno(), fileinput.filename(), result.group())
                print output
                temp.write(output)
        else:
            print "There are no files in the directory!!!"
Folks, I think user706808 wants to search for all occurrences of searchstring in file and:
Can you confirm? Assuming that's what you want, all you need to change is to split up this megaline of code...
    for line in fileinput.input(walk_dir(directory, (".log", ".txt"))):
into...
    for curPathname in walk_dir(directory, (".log", ".txt")):
        nOccurrences = 0
        for line in fileinput.input(curPathname):
            result = regex.search(whitespace.sub('', line))
            if result:
                ...  # same print/write as before, but without the 'break'
                nOccurrences += 1  # ignores multiple matches on same line
            # You don't need an 'elif not result' branch, since "not found" is decided per file, not per line
        # We only get here once we reach EOF for curPathname
        if nOccurrences == 0:
            # NOW, HERE, print the "not found" message for curPathname
            print "\nString not found in File: {0}\n".format(curPathname)
        # else you could print "found %d occurrences of %s in ..."
Sound good?
By the way you can now simply refer to fileinput.filename() as 'curPathname'.
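To make the restructured loop concrete, here is a minimal runnable sketch of it. The function name `report_matches` and the idea of collecting the messages in a list (instead of printing and writing to result2.log) are my own assumptions for illustration, not part of the original program; it is written to run under Python 3, so it avoids the `print` statement.

```python
import fileinput
import os
import re

WHITESPACE = re.compile(r'\s+')


def walk_dir(directory, extensions):
    """Yield the path of every file under `directory` whose name ends with one of `extensions`."""
    for path, dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(extensions):
                yield os.path.join(path, name)


def report_matches(directory, regex, extensions=(".log", ".txt")):
    """Hypothetical helper: one message per matching line, plus one per file with no match."""
    report = []
    for cur_pathname in sorted(walk_dir(directory, extensions)):
        n_occurrences = 0
        # One fileinput stream per file. Reading it to EOF (no 'break')
        # leaves fileinput ready for the next call to fileinput.input().
        for line in fileinput.input(cur_pathname):
            result = regex.search(WHITESPACE.sub('', line))
            if result:
                n_occurrences += 1  # counts matching lines, not every match on a line
                report.append("Line: {0}\nFile: {1}\nString Type: {2}".format(
                    fileinput.filelineno(), cur_pathname, result.group()))
        # Reached only after the whole file has been read.
        if n_occurrences == 0:
            report.append("String not found in File: {0}".format(cur_pathname))
    return report
```

You would call it as `report_matches(directory, regex)` with the `regex` you already compile, then print each entry and write it to your log file.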
(Also you might like to abstract the functionality into a function find_occurrences(searchstring,pathname) which returns int or Boolean 'nOccurrences'.)
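A rough sketch of what that function could look like, under a few assumptions of mine: `build_pattern` is a hypothetical helper that rebuilds your three-way regex, the file is read with plain `open()` rather than `fileinput`, and the hex form comes from `binascii.hexlify` because `str.encode('hex')` only exists in Python 2.

```python
import binascii
import re

WHITESPACE = re.compile(r'\s+')


def build_pattern(searchstring):
    """Compile a regex matching the string as plain text, hex digits, or concatenated decimal character codes."""
    as_hex = binascii.hexlify(searchstring.encode('utf-8')).decode('ascii')
    as_codes = ''.join(str(ord(char)) for char in searchstring)
    forms = (searchstring, as_hex, as_codes)
    return re.compile('|'.join(re.escape(form) for form in forms))


def find_occurrences(searchstring, pathname):
    """Return the number of lines in `pathname` that contain `searchstring` in any of the three forms."""
    regex = build_pattern(searchstring)
    n_occurrences = 0
    with open(pathname) as handle:
        for line in handle:
            # Whitespace is stripped before matching, as in the original search.
            if regex.search(WHITESPACE.sub('', line)):
                n_occurrences += 1
    return n_occurrences
```

The caller then only has to test `find_occurrences(userstring, curPathname) == 0` to decide whether to print the "not found" message for that file.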