Suppose I want to implement a Python script with the following signature:
myscript.py INPUT OUTPUT
...where INPUT and OUTPUT stand for the paths of files the script will read from and write to, respectively.
The code for implementing a script with such a signature may feature the following construct:
with open(inputarg, 'r') as instream, open(outputarg, 'w') as outstream:
    ...
...where the inputarg and outputarg variables hold the file paths (which are strings) passed to the script via its INPUT and OUTPUT command-line arguments.
Nothing special or unusual so far.
But now, suppose that, for version 2 of the script, I want to give the user the option to pass the special value - for either (or both) of its arguments, to indicate that the script should, respectively, read from stdin and write to stdout.
In other words, I want all the forms below to produce the same results:
myscript.py INPUT OUTPUT
myscript.py - OUTPUT <INPUT
myscript.py INPUT - >OUTPUT
myscript.py - - <INPUT >OUTPUT
Now, the with statement given earlier is no longer suitable. For one thing, open('-', 'w') would silently create a file literally named -, and open('-', 'r') would (unless such a file exists) raise an exception:
FileNotFoundError: [Errno 2] No such file or directory: '-'
I have not been able to come up with a convenient way to extend the with-based construct above to accommodate the desired new functionality.
For example, this variation won't do (on top of being somewhat unwieldy): sys.stdin and sys.stdout do implement the context manager interface, but leaving the with block would close them:
with (sys.stdin if inputarg == '-' else open(inputarg, 'r')) as instream, \
     (sys.stdout if outputarg == '-' else open(outputarg, 'w')) as outstream:
    ...
The only thing I can come up with (maybe) is to define a minimal pass-through wrapper class that implements the context manager interface, like this:
import sys

class stream_wrapper(object):
    def __init__(self, stream):
        # Write through __dict__ to bypass the forwarding __setattr__ below.
        self.__dict__['_stream'] = stream
    def __getattr__(self, attr):
        return getattr(self._stream, attr)
    def __setattr__(self, attr, value):
        return setattr(self._stream, attr, value)
    def close(self, _std=(sys.stdin, sys.stdout)):
        # Never close the standard streams; close anything else.
        if self._stream not in _std:
            self._stream.close()
    def __enter__(self):
        return self._stream
    def __exit__(self, *args):
        return self.close()
...and then write the with statement like this:
with stream_wrapper(sys.stdin if inputarg == '-' else open(inputarg, 'r')) as instream, \
     stream_wrapper(sys.stdout if outputarg == '-' else open(outputarg, 'w')) as outstream:
    ...
The stream_wrapper class strikes me as a lot of drama for what it achieves (assuming that it works at all: I have not tested it!).
Is there a simpler way to get the same results?
IMPORTANT: Any solution to this problem must take care never to close sys.stdin or sys.stdout.
Using a contextlib.contextmanager this can be managed with something like:
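from contextlib import contextmanager
import sys

@contextmanager
def stream(arg, mode='r'):
    if mode not in ('r', 'w'):
        raise ValueError('mode not "r" or "w"')
    if arg == '-':
        # Standard streams are handed out as-is and never closed here.
        yield sys.stdin if mode == 'r' else sys.stdout
    else:
        # Real files are closed when the inner with block exits.
        with open(arg, mode) as f:
            yield f

# The parenthesized form of with is officially supported from Python 3.10;
# on older versions, join the two items with a backslash continuation instead.
with (stream(sys.argv[1], 'r') as fin,
      stream(sys.argv[2], 'w') as fout
      ):
    for line in fin:
        fout.write(line)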
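If you are not familiar with contextmanager: it runs the code up to the yield on entry to the with block and the code after the yield on exit. Wrapping the yield of the opened file in its own with ensures that the file is closed whenever one was opened, while sys.stdin and sys.stdout are left open because nothing follows their yield.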
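To make the closing behaviour concrete, here is a small self-check, offered as a sketch rather than a definitive test. It repeats the stream helper so that it runs on its own. The check_closing function and the temporary file are hypothetical additions for illustration only.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def stream(arg, mode='r'):
    # Same helper as in the answer, repeated so this sketch is self-contained.
    if mode not in ('r', 'w'):
        raise ValueError('mode not "r" or "w"')
    if arg == '-':
        # Nothing follows this yield, so the standard stream is untouched on exit.
        yield sys.stdin if mode == 'r' else sys.stdout
    else:
        with open(arg, mode) as f:
            yield f  # leaving the inner with block closes f


def check_closing():
    """Hypothetical helper: return (file_closed, stdout_closed) after the with blocks exit."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with stream(path, 'w') as fout:
            fout.write('hello\n')
        with stream('-', 'w') as out:
            out.write('')  # writes nothing; only exercises the stdout branch
        return fout.closed, sys.stdout.closed
    finally:
        os.remove(path)


file_closed, stdout_closed = check_closing()
print(file_closed, stdout_closed)
```

The regular file should report as closed once its with block exits, and sys.stdout should still be open, which is the requirement stated in the question.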