I'm writing a Python script which expects a regex pattern and a file name and looks for that regex pattern within the file.
By default, the script requires a file to work on.
I want to change the script so that by default it takes its input from STDIN unless a file is specified (-f filename).
My code looks like so:
#!/usr/bin/env python3
# This Python script searches for lines matching regular expression -r (--regex) in file/s -f (--files).
import re
import argparse
#import sys

class colored:
    CYAN = '\033[96m'
    UNDERLINE = '\033[4m'
    END = '\033[0m'

def main(regex, file, underline, color):
    pattern = re.compile(regex)
    try:
        for i, line in enumerate(open(file, encoding="ascii")):
            for match in re.finditer(pattern, line):
                message = "Pattern {} was found on file: {} in line {}. The line is: ".format(regex, file, i+1)
                if args.color and args.underline:
                    #message = "Pattern {} was found on file: {} in line {}. The line is: ".format(regex, file, i+1)
                    l = len(line)
                    print(message + colored.CYAN + line + colored.END, end="")
                    print(" ", "^" * l)
                    break
                if args.underline:
                    l = len(line)
                    print(message + line, end="")
                    print(" ", "^" * l)
                    break
                if args.color:
                    print(message + colored.CYAN + line + colored.END, end="")
                    break
                if args.machine:
                    print("{}:{}:{}".format(file, i+1, line), end="")
                    break
                else:
                    print(message + line, end="")
                    break
    except FileNotFoundError:
        print("File not found, please supply")
        pass

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Python regex finder', epilog = './python_parser.py --regex [pattern] --files [file]')
    requiredNamed = parser.add_argument_group('required named arguments')
    requiredNamed.add_argument('-r', '--regex',
                               help='regex pattern', required=True)
    parser.add_argument('-f', '--file',
                        help='file to search pattern inside')
    parser.add_argument('-u', '--underline', action='store_true',
                        help='underline')
    parser.add_argument('-c', '--color', action='store_true',
                        help='color')
    parser.add_argument('-m', '--machine', action='store_true',
                        help='machine')
    args = parser.parse_args()
    main(args.regex, args.file, args.underline, args.color)
You can see how a run looks here.
I tried using the answer from this SO question, but I get the following error:
for i, line in enumerate(open(file, encoding="ascii")):
TypeError: expected str, bytes or os.PathLike object, not _io.TextIOWrapper
Edit #1:
This is the file:
Itai
# something
uuu
UuU
# Itai
# this is a test
this is a test without comment
sjhsg763
3989746
# ddd ksjdj #kkl
I get the above error when I supply no file.
Edit #2:
When I change the file argument to this:
parser.add_argument('-f', '--file',
                    help='file to search pattern inside',
                    default=sys.stdin,
                    type=argparse.FileType('r'),
                    nargs='?'
                    )
And then run the script like so:
~ echo Itai | ./python_parser.py -r "[a-z]" -m
Traceback (most recent call last):
File "./python_parser.py", line 59, in <module>
main(args.regex, args.file, args.underline, args.color)
File "./python_parser.py", line 16, in main
for i, line in enumerate(open(file, encoding="ascii")):
TypeError: expected str, bytes or os.PathLike object, not NoneType
➜ ~
args.file = tmpfile
which is a file in the same directory where the script runs.
What am I doing wrong?
You wrote this:
def main(regex, file, underline, color):
    ...
    for i, line in enumerate(open(file, encoding="ascii")):
You have some confusion about whether file denotes a filename or an open file object. You want it to be an open file object, so that you can pass in sys.stdin. That means main() should not attempt to open() anything; rather, it should rely on the caller to pass in an already open file object.
Pushing the responsibility for calling open() up into the caller of main() will let you assign file = sys.stdin by default, and then re-assign the result of open() if it turns out that a filename was specified.
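A minimal sketch of that split, assuming the matching logic only needs an iterable of lines. The names search() and run() are hypothetical and not part of the original script, and the output formatting is reduced to the machine-readable form to keep the example short:

```python
import io
import re
import sys


def search(pattern, lines, name="<stdin>"):
    """Return (name, line_number, line) for every line that matches pattern.

    `lines` is any iterable of strings, such as an open file object or
    sys.stdin. This function never calls open() itself.
    """
    compiled = re.compile(pattern)
    return [(name, number, line.rstrip("\n"))
            for number, line in enumerate(lines, start=1)
            if compiled.search(line)]


def run(regex, filename=None):
    """Read from sys.stdin by default; open `filename` only if one was given."""
    if filename is None:
        return search(regex, sys.stdin)
    with open(filename, encoding="ascii") as handle:
        return search(regex, handle, name=filename)


# Any iterable of lines works, so an in-memory file stands in for stdin here.
sample = io.StringIO("Itai\n# something\nuuu\n")
for name, number, line in search(r"^#", sample):
    print("{}:{}:{}".format(name, number, line))
```

With the parser from the question, where -f has no default, args.file is None when the option is omitted, so the caller could presumably be as simple as run(args.regex, args.file).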