Tags: bash, sed, glob, shopt

sed fails when "shopt -s nullglob" is set


A few days ago I started a small bash script that sums up the number of pages and the file size of all PDFs in a folder. It works quite well now, but there is still one thing I don't understand.

Why does sed always fail when shopt -s nullglob is set? Does anybody know why this happens?

I'm working with GNU Bash 4.3 and sed 4.2.2 in Ubuntu 14.04.

set -u
set -e

folder=$1

overallfilesize=0
overallpages=0
numberoffiles=0

#If glob fails nothing should be returned
shopt -s nullglob

for file in $folder/*.pdf
do

  # Disable empty string if glob fails
  # (Necessary because otherwise sed fails ?:|)
  #shopt -u nullglob

  # This command is allowed to fail
  set +e
  pdfinfo="$(pdfinfo "$file" 2> /dev/null)"
  ret=$? 
  set -e  

  if [[ $ret -eq 0 ]]
  then 
    #Remove every non digit in the result
    sedstring='s/[^0-9]//g'
    filesize=$(echo -e "$pdfinfo" | grep -m 1 "File size:" | sed $sedstring)
    pages=$(echo -e "$pdfinfo" | grep -m 1 "Pages:" | sed $sedstring)

    overallfilesize=$(($overallfilesize + $filesize))  
    overallpages=$(($overallpages+$pages))  
    numberoffiles=$(($numberoffiles+1))  
  fi

done

echo -e "Processed files: $numberoffiles"
echo -e "Pagesum: $overallpages"
echo -e "Filesizesum [Bytes]: $overallfilesize"

Solution

  • Here's a simpler test case for reproducing your problem:

    #!/bin/bash
    shopt -s nullglob
    pattern='s/[^0-9]//g'
    sed $pattern <<< foo42
    

    Expected output:

    42
    

    Actual output:

    Usage: sed [OPTION]... {script-only-if-no-other-script} [input-file]...
    (sed usage follows)
    

    This happens because s/[^0-9]//g is a valid glob (matching a directory structure like s/c/g), and by leaving $pattern unquoted you asked bash to interpret it. Since you don't have a matching file, nullglob kicks in and removes the pattern entirely, so sed is started without any script argument and prints its usage message.
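To make the expansion visible, here is a minimal sketch that counts how many arguments bash produces from the pattern before sed would even run. The scratch directory and the variable names (workdir, args, no_match_count, match_count, quoted_count) are made up for illustration, and it assumes mktemp -d is available:

```shell
#!/bin/bash
# Sketch: count the words bash generates from the sed script, unquoted vs. quoted.
shopt -s nullglob
pattern='s/[^0-9]//g'

origdir=$PWD
workdir=$(mktemp -d)            # hypothetical scratch directory
cd "$workdir" || exit 1

# Unquoted, no matching path: nullglob makes the word disappear.
args=( $pattern )
no_match_count=${#args[@]}
echo "unquoted, no match: $no_match_count argument(s)"

# Create a path the glob matches: s/<one non-digit character>/g
mkdir -p s/c
touch s/c/g
args=( $pattern )
match_count=${#args[@]}
echo "unquoted, with match: $match_count argument(s): ${args[*]}"

# Quoted: the string is passed through untouched, whatever files exist.
args=( "$pattern" )
quoted_count=${#args[@]}
echo "quoted: $quoted_count argument(s): ${args[*]}"

# Clean up the scratch directory.
cd "$origdir" || exit 1
rm -rf "$workdir"
```

In the first case sed would receive no script at all. In the second it would receive a file path instead of the intended substitution, which is arguably worse because the failure is less obvious.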

    Double quoting prevents word splitting and glob interpretation, which is almost always what you want:

    #!/bin/bash
    shopt -s nullglob
    pattern='s/[^0-9]//g'
    sed "$pattern" <<< foo42
    

    This produces the expected output.

    You should always double quote all your variable references, unless you have a specific reason not to.
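Applied to the script in the question, the fix could look roughly like the sketch below. The pdfinfo text is a hypothetical stand-in (not real tool output) so the snippet runs without a PDF at hand, and printf replaces echo -e only to avoid interpreting backslashes in the data:

```shell
#!/bin/bash
shopt -s nullglob

# Hypothetical sample standing in for "$(pdfinfo "$file")".
pdfinfo='Pages:          12
File size:      34567 bytes'

# Remove every non-digit from the matching line.
sedstring='s/[^0-9]//g'

# "$sedstring" is quoted, so nullglob can no longer remove it.
filesize=$(printf '%s\n' "$pdfinfo" | grep -m 1 "File size:" | sed "$sedstring")
pages=$(printf '%s\n' "$pdfinfo" | grep -m 1 "Pages:" | sed "$sedstring")

echo "pages=$pages filesize=$filesize"
```

The same rule applies to the loop header: writing it as for file in "$folder"/*.pdf keeps the glob active while protecting a folder name that contains spaces or glob characters.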