python-3.x stdin argparse

Python: read from STDIN unless a file is specified, how is it done?


I'm writing a Python script which expects a regex pattern and a file name and looks for that regex pattern within the file.

By default, the script requires a file to work on.

I want to change the script so that by default it takes its input from STDIN unless a file is specified (-f filename).

My code looks like so:

#!/usr/bin/env python3
# This Python script searches for lines matching regular expression -r (--regex) in file/s -f (--files).

import re
import argparse
#import sys

class colored:
   CYAN = '\033[96m'
   UNDERLINE = '\033[4m'
   END = '\033[0m'

def main(regex, file, underline, color):
    pattern = re.compile(regex)
    try:
        for i, line in enumerate(open(file, encoding="ascii")):
            for match in re.finditer(pattern, line):
                message = "Pattern {} was found on file: {} in line {}. The line is: ".format(regex, file, i+1)
                if args.color and args.underline:
                    #message = "Pattern {} was found on file: {} in line {}. The line is: ".format(regex, file, i+1)
                    l = len(line)
                    print(message + colored.CYAN + line + colored.END, end="")
                    print("                                                                " ,"^" * l)
                    break
                if args.underline:
                    l = len(line)
                    print(message + line, end="")
                    print("                                                                " ,"^" * l)
                    break
                if args.color:
                    print(message + colored.CYAN + line + colored.END, end="")
                    break
                if args.machine:
                    print("{}:{}:{}".format(file, i+1, line), end="")
                    break
                else:
                    print(message + line, end="")
                    break

    except FileNotFoundError:
        print("File not found, please supply")
        pass

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Python regex finder', epilog = './python_parser.py --regex [pattern] --files [file]')
    requiredNamed = parser.add_argument_group('required named arguments')
    requiredNamed.add_argument('-r', '--regex',
                        help='regex pattern', required=True)
    parser.add_argument('-f', '--file',
                        help='file to search pattern inside')
    parser.add_argument('-u', '--underline', action='store_true',
                        help='underline')
    parser.add_argument('-c', '--color', action='store_true',
                        help='color')
    parser.add_argument('-m', '--machine', action='store_true',
                        help='machine')
    args = parser.parse_args()

    main(args.regex, args.file, args.underline, args.color)

You can see how a run looks here.

I tried using the answer from this SO question, but I get the following error:

for i, line in enumerate(open(file, encoding="ascii")):
TypeError: expected str, bytes or os.PathLike object, not _io.TextIOWrapper

Edit #1:

This is the file:

Itai
# something
uuu
UuU
# Itai
# this is a test
this is a test without comment
sjhsg763
3989746
# ddd ksjdj #kkl

I get the above error when I supply no file.

Edit #2:

When I change the file argument to this:

parser.add_argument('-f', '--file',
                        help='file to search pattern inside',
                        default=sys.stdin,
                        type=argparse.FileType('r'),
                        nargs='?'
                        )

And then run the script like so:

~ echo Itai | ./python_parser.py -r "[a-z]" -m
Traceback (most recent call last):
  File "./python_parser.py", line 59, in <module>
    main(args.regex, args.file, args.underline, args.color)
  File "./python_parser.py", line 16, in main
    for i, line in enumerate(open(file, encoding="ascii")):
TypeError: expected str, bytes or os.PathLike object, not NoneType
➜  ~

args.file = tmpfile

which is a file in the same directory where the script runs.

What am I doing wrong?


Solution

  • You wrote this:

    def main(regex, file, underline, color):
    ...
            for i, line in enumerate(open(file, encoding="ascii")):
    

    The code is unclear about whether file denotes a filename or an open file object. You want it to be an open file object so that you can pass in sys.stdin. That means main() should not call open() itself; it should rely on the caller to pass in an already open file object.

    Moving the open() call out of main() and up into the calling code (the if __name__ == "__main__": block) lets you assign file = sys.stdin by default and then reassign it to the result of open() if a filename was specified.
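
    The idea can be sketched roughly as follows. This is a minimal illustration, not a drop-in replacement for the script in the question: the names search, run and demo are made up for this example, the --underline/--color/--machine options are left out, and the output always uses the file:line:text form. The stdin parameter exists only so the demo can pass an in-memory stream in place of piped input.

```python
import argparse
import io
import re
import sys


def search(pattern, lines, name):
    """Return 'name:line_number:line' for each line that matches pattern.

    `lines` is any iterable of text lines, e.g. an open file or sys.stdin.
    """
    return ["{}:{}:{}".format(name, number, line)
            for number, line in enumerate(lines, start=1)
            if pattern.search(line)]


def run(argv=None, stdin=None):
    """Parse argv, choose the input stream, and return the matching lines."""
    parser = argparse.ArgumentParser(description="Python regex finder")
    parser.add_argument("-r", "--regex", required=True, help="regex pattern")
    parser.add_argument("-f", "--file", help="file to search (default: STDIN)")
    args = parser.parse_args(argv)
    pattern = re.compile(args.regex)

    if args.file is None:
        # No -f given: use the already-open standard input.
        stream = sys.stdin if stdin is None else stdin
        return search(pattern, stream, "<stdin>")
    # -f given: open the file here and hand the file object to search().
    with open(args.file, encoding="ascii") as handle:
        return search(pattern, handle, args.file)


# Demo: an in-memory stream stands in for `echo ... | ./python_parser.py`.
demo = run(["-r", "[a-z]"], stdin=io.StringIO("Itai\n3989746\nuuu\n"))
print("".join(demo), end="")
```

    In a real script you would call run() with no arguments, so that argparse reads sys.argv and the input comes from sys.stdin unless -f is given. The key point is that search() never calls open(): it only iterates over whatever open file object it receives.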