Tags: python, bash, awk, subprocess, ls

Awk in Python's subprocess giving Invalid Expressions "'" error


I am trying to read the filename and timestamp of the most recent file for each of the two naming schemes shown in the code. I have the following code, roughly:

#!/usr/bin/env python
import string, subprocess, sys, os
mypath = "/path/to/file"


my_cmd = (["ls -lt --full-time " + mypath + "*DAI*.txt",
          "ls -lt --full-time " + mypath + "*CA*.txt"]
         )
getmostrecent_cmd = "head -n 1"
getcols_cmd = "awk '{ print $6, $7, $9 }'"

for cmd in my_cmd:
    p1 = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
    p2 = subprocess.Popen(getmostrecent_cmd.split(), stdin=p1.stdout, stdout=subprocess.PIPE)
    p3 = subprocess.Popen(getcols_cmd.split(), stdin=p2.stdout, stdout=subprocess.PIPE)
    output = p3.communicate()[0]

    print output

which gives me the following errors:

ls: cannot access /path/to/file/*DAI*.txt: No such file or directory
awk: '{
awk: ^ invalid char ''' in expression

ls: cannot access /path/to/file/*CA*.txt: No such file or directory
awk: '{
awk: ^ invalid char ''' in expression

But:

  1. I can use "ls -lt --full-time /path/to/file/*DAI*.txt" and get a result in the terminal. Why is it causing an issue with the same path?
  2. The awk command works fine when passed to subprocess directly as a list; e.g. subprocess.Popen(["awk", ....], stdin=...., stdout=....) worked okay. But now I am getting an error about the single quote. I tried triple-quoting the string and escaping the single quote.

Solution

  • I can use "ls -lt --full-time /path/to/file/*DAI*.txt" and get a result in the terminal. Why is it causing an issue with the same path?

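    Glob expansion is performed by the shell. By default, Popen() starts the new process without involving a shell, so ls receives the literal string *DAI*.txt and finds no file with that name. To have the pattern expanded, pass shell=True and give the command as a single string rather than a split list (with shell=True on POSIX, only the first item of a list is run as the command; the remaining items become arguments to the shell itself):

    p1 = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
    #                                                  ^^^^^^^^^^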
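
If you would rather avoid shell=True, one option (not part of the original answer) is to let Python expand the pattern with the standard glob module. The sketch below is written for Python 3, and it uses a temporary directory and made-up file names purely for illustration; newest_match is a hypothetical helper name, not something from the question.

```python
import glob
import os
import tempfile


def newest_match(pattern):
    """Return the most recently modified path matching pattern, or None."""
    matches = glob.glob(pattern)  # Python expands the wildcard itself
    return max(matches, key=os.path.getmtime) if matches else None


# Demo with made-up file names in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    names = ["old_DAI_1.txt", "new_DAI_2.txt", "other_CA_1.txt"]
    for i, name in enumerate(names):
        path = os.path.join(tmp, name)
        open(path, "w").close()
        # Give each file a distinct, known modification time.
        os.utime(path, (1000 * (i + 1), 1000 * (i + 1)))

    pattern = os.path.join(tmp, "*DAI*.txt")
    pattern_is_a_file = os.path.exists(pattern)  # the literal pattern names no file
    newest_name = os.path.basename(newest_match(pattern))

print(pattern_is_a_file)  # False
print(newest_name)        # new_DAI_2.txt
```

The first result mirrors the "No such file or directory" error: without a shell, the pattern is just an odd file name. Picking the newest file by modification time in Python also removes the need for head in this particular case.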
    

  • The awk command works fine when passed to subprocess directly as a list; e.g. subprocess.Popen(["awk", ....], stdin=...., stdout=....) worked okay. But now I am getting an error about the single quote. I tried triple-quoting the string and escaping the single quote.

    On the shell command line, the single quotes in awk '{ print $6, $7, $9 }' are needed so that the shell treats { print $6, $7, $9 } as a single argument (and does not try to expand $6, $7 and $9 itself). The shell removes the quotes, and awk only sees the string { print $6, $7, $9 }. Your getcols_cmd.split() call instead splits on every space and keeps the quotes, so awk receives '{ as its program, which is exactly what the error message shows. Since Popen() by default doesn't involve the shell and passes the arguments to the command verbatim, you don't need the single quotes:

    subprocess.Popen(["awk", "{ print $6, $7, $9 }"], stdin=...., stdout=....)
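
To see the difference for yourself, it may help to compare str.split() with shlex.split() from the standard library, which tokenizes a string using shell-like quoting rules. This is a minimal Python 3 sketch; the sample line is made up to resemble ls -l --full-time output, and the awk call is skipped if awk is not found on the PATH.

```python
import shlex
import shutil
import subprocess

getcols_cmd = "awk '{ print $6, $7, $9 }'"

naive_args = getcols_cmd.split()             # splits on whitespace, keeps the quotes
shell_like_args = shlex.split(getcols_cmd)   # honours the quotes, then drops them

print(naive_args)       # ['awk', "'{", 'print', '$6,', '$7,', '$9', "}'"]
print(shell_like_args)  # ['awk', '{ print $6, $7, $9 }']

# Made-up line in the shape of `ls -l --full-time` output.
sample = "-rw-r--r-- 1 user group 0 2024-01-02 03:04:05.000000000 +0000 a_DAI_1.txt\n"

if shutil.which("awk"):
    # No shell is involved: awk receives the program as one unquoted argument.
    result = subprocess.run(shell_like_args, input=sample,
                            capture_output=True, text=True)
    print(result.stdout, end="")
```

If you want to keep your commands as strings, shlex.split() is a drop-in replacement for .split() in the head and awk calls. It does not expand wildcards, though, so the ls pattern still needs a shell or the glob module.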