Tags: bash, for-loop, space

for loop through files with spaces in their names


I have a directory of zipped files, each containing a group of XML files. I need to make a script that will extract XML files from those ZIPs if they contain a certain string:

for z in `ls /path/to/archives/*.zip`
do for f in `unzip -l $z | grep 'xml' | awk -F" " '{print "$4" "$5}'`
  do r = $( unzip -p $z $f | grep $string )
    if [ '$r' != '' ]
    unzip $z $f
    fi
  done
done

When this runs, a zip file A.zip containing a file called 'my file.xml' causes the loop to handle it as 2 files, 'my' and 'file.xml'. unzip then tries to extract a file named 'my' from A.zip, which fails.

Any ideas on how to force the for loop not to consider the space in the file name as a separator?


Solution

  • Use the -Z1 option of unzip instead of -l. It outputs one file name per line with no additional information. Read its output with a while read loop instead of looping over it with for; that prevents word splitting. You might still have problems with file names containing a newline (but I wasn't able to zip them: $'a\nb' was stored as a^Jb and extracted as ab).

    Also, your if is missing a then.

    Also, don't parse the output of ls; you can iterate over the glob pattern itself.

    You don't need to check that grep outputs anything, just run it with -q and check its exit status.

    Don't forget to double-quote variables that might contain whitespace or other special characters.

    for z in /path/to/archives/*.zip ; do
        while IFS= read -r f ;  do
            if unzip -p "$z" "$f" | grep -q "$string" ; then
                unzip "$z" "$f"
            fi
        done < <(unzip -Z1 "$z" '*.xml')
    done
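
To see the difference without needing any zip files, here is a minimal sketch. The `listing` variable is a made-up stand-in for what `unzip -Z1` would print (one name per line), and the `for_count`, `read_count` and `matched` names are only for illustration. It assumes bash, since it uses process substitution like the loop above.

```bash
#!/usr/bin/env bash
# Hypothetical listing: two file names, one per line, one of them with a space.
listing=$'my file.xml\nother.xml'

# for over an unquoted command substitution: the shell splits the output
# on spaces as well as newlines, so 'my file.xml' becomes two words.
for_count=0
for f in $(printf '%s\n' "$listing"); do
    for_count=$((for_count + 1))
done

# while read with IFS= and -r: one iteration per line, each name kept intact.
read_count=0
while IFS= read -r f; do
    read_count=$((read_count + 1))
done < <(printf '%s\n' "$listing")

echo "for: $for_count items, while read: $read_count items"

# grep -q prints nothing; its exit status alone says whether the string matched,
# so it can be used directly as the if condition.
if printf '%s\n' '<a>needle</a>' | grep -q 'needle'; then
    matched=yes
else
    matched=no
fi
echo "matched: $matched"
```

The for loop sees three items for two file names, while the while read loop sees two, which is why the loop over unzip's output should be a while read.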