Tags: python, linux, bash, python-2.7, file-descriptor

In Python, how are custom file descriptors used for input and output, including defaults setup and final closing?


I'm looking to understand how custom file descriptors work in Python for input, output, defaults setup, and final closing. I have a file in Bash that does exactly what I want to do in Python. Can anyone tell me how this would be done in Python? I'm using Python 2.7.5, Bash 4.2, and executing on CentOS 7.3.

setup

$ echo "input string" > input

bash_fd.sh

#!/bin/bash

# Demonstration of custom file descriptors in Bash
#     3  script input (scrin)
#     4  script output (scrout)
#     5  script error (screrr)
#     6  script data (scrdata, demo: JSON return payload)
#     7  script log (scrlog)

fd_open()
{
   ### Provide defaults for file descriptors 3-7 ONLY if the FDs are undefined
   { >&3; } 2>/dev/null || exec 3<&0            # dup scrin   to stdin
   { >&4; } 2>/dev/null || exec 4>&1            # dup scrout  to stdout
   { >&5; } 2>/dev/null || exec 5>&2            # dup screrr  to stderr
   { >&6; } 2>/dev/null || exec 6>/dev/null     # set scrdata to /dev/null
   { >&7; } 2>/dev/null || exec 7>/dev/null     # set scrlog  to /dev/null
}

fd_close()
{
   # Close all file descriptors
   exec 3>&-
   exec 4>&-
   exec 5>&-
   exec 6>&-
   exec 7>&-
}

main()
{
   fd_open                                      # Ensure FDs 3-7 are defined

   echo "[$(date)] Program beginning" >&7       # scrlog

   echo -n 'Enter a message: ' >&4              # scrout
   read MSG <&3                                 # scrin

   echo "Read message $MSG" >&4                 # scrout
   echo "[screrr] Read message $MSG" >&5        # screrr

   echo "{\"msg\": \"$MSG\"}" >&6               # scrdata: return JSON payload
   echo "[$(date)] Program finishing: $MSG" >&7 # scrlog

   fd_close

   return ${1:-0}                               # return status code
}

# For demonstration purposes, $1 is the return code returned when calling main
main "$1"

invocation

$ ./bash_fd.sh 37 3<input 4>scrout 5>screrr 6>scrdata 7>scrlog
$

return code

$ echo $?
37

generated files

$ cat scrout
Enter a message: Read message input string

$ cat screrr
[screrr] Read message input string

$ cat scrdata
{"msg": "input string"}

$ cat scrlog
[Wed Jun 14 21:33:24 EDT 2017] Program beginning
[Wed Jun 14 21:33:24 EDT 2017] Program finishing: input string

Any help in translating the above Bash script to Python will really help me understand Python and custom file descriptors and will be greatly appreciated.


Solution

  • Python 2's file object is a fairly thin wrapper over C's stdio FILE structure, which itself contains the corresponding descriptor (an integer). That is why the docs so often refer to the underlying stdio functions.

    • Each time you create a file object (open()), a descriptor corresponding to the file is opened and used in all I/O operations with the object.

      • You can get it with <file>.fileno().
      • Conversely, if you have a raw descriptor, you can wrap it in a file object with os.fdopen().
        • E.g. if you told bash to redirect specific descriptors for your script, bash has already opened those descriptors in the script's process, so they are ready to be wrapped.
    • When a file object is closed or garbage-collected, the underlying descriptor is closed, too.

    • The os module has a few other functions for working with descriptors that mirror the corresponding C functions, such as os.dup().

    Generally, you should use file objects and not bother with their underlying descriptors. You can do this even with functions that return raw descriptors, such as os.pipe().
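To make the relationship between descriptors and file objects concrete, here is a minimal sketch that wraps the two raw descriptors returned by os.pipe() in file objects. The variable names are just for illustration.

```python
import os

# os.pipe() returns two raw descriptors (plain integers).
read_fd, write_fd = os.pipe()

# Wrap each descriptor in a file object; from here on, use only the objects.
reader = os.fdopen(read_fd, "r")
writer = os.fdopen(write_fd, "w")

# fileno() goes the other way: from the file object back to the descriptor.
same_fd = reader.fileno() == read_fd

writer.write("hello\n")
writer.close()            # flushes and closes write_fd as well
line = reader.readline()  # reads what was written to the pipe
reader.close()            # closes read_fd as well
```

After the two close() calls, both underlying descriptors are closed too, so there is nothing left to clean up at the descriptor level.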


    Examples:

    (entities in angle brackets are pseudocode, showing what is to be inserted there)

    Check if a descriptor exists:

    While "How to check if a given file descriptor stored in a variable is still valid?" suggests (UNIX-only) fcntl or (portable) dup as the least intrusive ways to check, you are going to use the descriptor via a file object anyway, so it's best to just attempt to wrap it:

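    import errno
    import os
    import sys

    fd = 3  # the descriptor to probe, e.g. "scrin" from the Bash script
    try:
        f = os.fdopen(fd)
    except OSError as e:
        if e.errno != errno.EBADF:
            raise
        # the descriptor doesn't exist: create `f' some other way
        f = sys.stdin
    else:
        pass  # actions when the descriptor exists
    # use `f'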
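If you only want to test a descriptor without wrapping it, the dup approach mentioned above could be sketched roughly like this. The helper name fd_is_open is made up for this example.

```python
import errno
import os


def fd_is_open(fd):
    """Return True if `fd` refers to an open descriptor (hypothetical helper)."""
    try:
        # Duplicate the descriptor and close the copy right away;
        # the original descriptor is left untouched.
        os.close(os.dup(fd))
    except OSError as e:
        if e.errno != errno.EBADF:
            raise  # some other problem, e.g. too many open files
        return False
    return True
```

For example, fd_is_open(3) would tell you whether the caller redirected descriptor 3 before starting the script.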
    

    Duplicate FDs

    Not really needed - you can just assign e.g. f = sys.stdin depending on a condition and use f. The only case where you really need this is if you must provide the extra FDs to other processes.

    E.g. to duplicate an FD of a file object and create another file object over the duplicate:

    os.dup2(old_f.fileno(),<new_fd>)
    new_f = os.fdopen(<new_fd>)
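A small self-contained sketch of the same idea follows. So that it does not clobber a descriptor that is already in use, it first reserves a free descriptor number by opening /dev/null; that step, and the temporary file, are assumptions made only for this example.

```python
import os
import tempfile

old_f = tempfile.TemporaryFile(mode="w+")

# Reserve a free descriptor number to act as the target of dup2().
new_fd = os.open(os.devnull, os.O_WRONLY)

# dup2() closes whatever new_fd referred to and makes it a duplicate
# of old_f's descriptor; both now share one file and one file offset.
os.dup2(old_f.fileno(), new_fd)
new_f = os.fdopen(new_fd, "w")

new_f.write("via duplicate\n")
new_f.close()           # closes only the duplicate, old_f stays open

old_f.seek(0)
content = old_f.read()  # the text written through the duplicate
old_f.close()
```

Note the explicit "w" mode: os.fdopen() defaults to read mode, so the mode has to match how you intend to use the duplicate.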
    

    Reading from/Writing to/closing an FD

    Read from, write to, or close the file object wrapping that FD. See "Reading and Writing Files" in the Python tutorial; the only difference is that when you have a raw FD, you create the file object with os.fdopen() instead of open().
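Putting the pieces together, a rough Python counterpart of bash_fd.sh might look like the sketch below. It is one possible translation, not the only one: the helper names open_fd and run are made up, the descriptor numbers are passed in as a parameter so they can be changed, and the timestamp format from time.ctime() differs slightly from that of date.

```python
import errno
import json
import os
import sys
import time


def open_fd(fd, mode, default):
    """Wrap `fd` in a file object, or return `default` if `fd` is not open."""
    try:
        return os.fdopen(fd, mode)
    except OSError as e:
        if e.errno != errno.EBADF:
            raise
        return default


def run(fds=(3, 4, 5, 6, 7), status=0):
    # Defaults mirror fd_open() in the Bash script.
    devnull = open(os.devnull, "w")
    scrin = open_fd(fds[0], "r", sys.stdin)
    scrout = open_fd(fds[1], "w", sys.stdout)
    screrr = open_fd(fds[2], "w", sys.stderr)
    scrdata = open_fd(fds[3], "w", devnull)
    scrlog = open_fd(fds[4], "w", devnull)

    scrlog.write("[%s] Program beginning\n" % time.ctime())

    scrout.write("Enter a message: ")
    scrout.flush()
    msg = scrin.readline().rstrip("\n")

    scrout.write("Read message %s\n" % msg)
    screrr.write("[screrr] Read message %s\n" % msg)

    scrdata.write(json.dumps({"msg": msg}) + "\n")  # JSON payload
    scrlog.write("[%s] Program finishing: %s\n" % (time.ctime(), msg))

    # Counterpart of fd_close(): closing a file object closes its descriptor.
    # The standard streams are left open because they were not duplicated.
    for f in (scrin, scrout, screrr, scrdata, scrlog, devnull):
        if f not in (sys.stdin, sys.stdout, sys.stderr) and not f.closed:
            f.close()
    return status
```

To use it as a script, you would end the file with something like sys.exit(run(status=int(sys.argv[1]) if len(sys.argv) > 1 else 0)) and invoke it the same way as the Bash version, e.g. python python_fd.py 37 3<input 4>scrout 5>screrr 6>scrdata 7>scrlog (python_fd.py is a hypothetical file name).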