bash, shell, grep, unzip, cut

Using unzip, grep and cut


I'm looking at some legacy code and can't figure out what it's meant to do:

for date in 19960822 19960823 19960824 19960825 19960826 \
        19960827 19960828 19960829 19960830 19960831
do
   echo "copying files from date $date..."
   unzip /mnt/cdrom/$date.zip \
  `grep $date $DIR/files.eng | cut -d/ -f2` > /dev/null
done

What's that backslash \ doing after the unzip command? Why would you continue an unzip command?

Also, cut uses a custom flag (a forward slash /) and extracts the second column but then redirects to /dev/null? Why?

Note: I'm on OS X.


Solution

  • A backslash at the very end of a line escapes the newline, so the command continues onto the next line. It is there only to keep the long unzip command readable. Your code is interpreted this way:

    for date in 19960822 19960823 19960824 19960825 19960826 19960827 19960828 19960829 19960830 19960831
    do echo "copying files from date $date..."
       unzip /mnt/cdrom/$date.zip `grep $date $DIR/files.eng | cut -d/ -f2` > /dev/null
    done
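
As a quick check of the line-continuation rule, here is a minimal sketch (not part of the original script; the variable names are made up for illustration) that captures the output of the same command written on one line and split with a backslash:

```shell
# A backslash immediately followed by a newline is removed before the
# command is parsed, so both substitutions run exactly the same command.
one_line=$(printf '%s,' a b c)
continued=$(printf '%s,' a \
        b c)

echo "$one_line"
echo "$continued"
```

Both echo lines should print the same text, which is why the legacy loop behaves as if the unzip command were written on a single line.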
    
    1. The >/dev/null redirects stdout of the unzip command, not of cut. cut runs inside the command substitution, so its output becomes arguments to unzip. I believe unzip -q (quiet mode) would have much the same effect. It looks like this was a failed attempt to suppress the 'not matched' errors, which are written to stderr. Use 2>/dev/null for those.
    2. grep $date $DIR/files.eng | cut -d/ -f2 is a command substitution; backticks are the legacy form, and $( ) is generally preferred because it nests and quotes more predictably. Here it supplies the "list" from unzip's usage:
       Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
       Default action is to extract files in list, except those in xlist, to exdir;
    3. The '/' is not a custom flag; it is the delimiter given to cut's -d option, and -f2 selects the second field. Its purpose here is to strip the leading directory so the substitution returns only filenames. A more robust way to do this is with xargs and basename, which works however many directory levels the path has: grep $date $DIR/files.eng | xargs -n1 basename
    4. unzip throws the 'not matched' errors only because the zip file does not contain some of the xml filenames found in files.eng.
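
To illustrate item 3, here is a small sketch comparing cut -d/ -f2 with basename. The question does not show what a line of files.eng looks like, so both sample paths below are assumptions chosen only to show where the two approaches agree and where they differ:

```shell
# Hypothetical index lines; the real format of files.eng is not shown
# in the question.
shallow='19960822/report.xml'
deep='archive/19960822/report.xml'

# cut returns the second '/'-separated field, whatever it happens to be.
cut_shallow=$(printf '%s\n' "$shallow" | cut -d/ -f2)
cut_deep=$(printf '%s\n' "$deep" | cut -d/ -f2)

# basename always returns the last path component.
base_deep=$(printf '%s\n' "$deep" | xargs -n1 basename)

echo "$cut_shallow"   # report.xml
echo "$cut_deep"      # 19960822
echo "$base_deep"     # report.xml
```

With exactly one directory level the two give the same result, so the legacy script was presumably written for paths of that shape.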

    Try using:

    unzip "/mnt/cdrom/${date}.zip" $(grep "${date}" "${DIR}/files.eng" | xargs -n1 basename) 2>/dev/null
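
The difference between > /dev/null and 2>/dev/null can be seen without a CD-ROM or zip file. In this sketch, noisy is a hypothetical stand-in for unzip, and its two messages are only illustrative of the kind of output unzip produces:

```shell
# noisy is a hypothetical stand-in for unzip: a normal progress message
# goes to stdout and a warning goes to stderr.
noisy() {
    echo "inflating: report.xml"
    echo "caution: filename not matched: missing.xml" >&2
}

# Capture whatever would still reach the terminal in each case. The outer
# 2>&1 folds any surviving stderr into the captured text.
with_stdout_discarded=$( { noisy > /dev/null; } 2>&1 )
with_stderr_discarded=$( { noisy 2> /dev/null; } 2>&1 )

echo "$with_stdout_discarded"
echo "$with_stderr_discarded"
```

The first echo still shows the warning, which matches what the legacy script did: it hid the normal output and left the 'not matched' messages visible. The second shows only the normal output.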
    

    EDIT:

    To clarify, since a number of comments call this sloppy programming: it is sloppy, but the syntax is not incorrect. The command targets specific, predetermined files to be extracted. Other contents of the .zip file that do not match the list will not be extracted. In terms of the usage line, your unzip arguments are as follows:

    file[.zip]  zipped_file="/mnt/cdrom/${date}.zip"
    [list]      include_list=$(grep ${date} ${DIR}/files.eng | xargs -n1 basename)   (one argument per filename)
    [-x xlist]  exclude_list=   (empty)
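
The argument breakdown above can be checked with a dry run. In this sketch, show_args is a hypothetical stand-in for unzip that only numbers the arguments it receives, and the contents written to files.eng are made-up sample data in a throwaway directory:

```shell
# show_args is a hypothetical stand-in for unzip: it prints each argument
# it receives, numbered, instead of extracting anything.
show_args() {
    i=1
    for arg in "$@"; do
        echo "$i: $arg"
        i=$((i + 1))
    done
}

DIR=$(mktemp -d)    # throwaway directory for the demo
date=19960822
printf '%s\n' '19960822/a.xml' '19960822/b.xml' '19960823/c.xml' > "$DIR/files.eng"

# The substitution is deliberately left unquoted so that the shell splits
# it into one argument per filename.
args=$(show_args "/mnt/cdrom/${date}.zip" \
    $(grep "$date" "$DIR/files.eng" | xargs -n1 basename))
echo "$args"

rm -rf "$DIR"
```

The zip path arrives as the first argument and each matching filename as a separate argument after it. The same word splitting means a filename containing spaces would be broken apart, which is one reason the original approach is fragile.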