I need to verify that all images listed in a CSV file are present in a folder. I wrote a small shell script for that:
#!/bin/zsh
red='\033[0;31m'
color_Off='\033[0m'
csvfile=$1
imgpath=$2
cat $csvfile | while IFS=, read -r filename rurl
do
    if [ -f "${imgpath}/${filename}" ]
    then
        echo -n
    else
        echo -e "$filename ${red}MISSING${color_Off}"
    fi
done
My CSV looks something like this:
Image1.jpg,detail-1
Image2.jpg,detail-1
Image3.jpg,detail-1
The CSV was created by Excel.
All 3 images are present in imgpath, but for some reason my output says:
Image1.jpg MISSING
When I ran the script with zsh -x, I found that my CSV file has a BOM at the very beginning, which turns the first image name into \ufeffImage1.jpg and causes the whole issue.
How can I ignore a BOM (byte order mark) in a while read operation?
zsh provides a parameter expansion (also available in POSIX shells) that removes a prefix: ${var#prefix} expands to the value of $var with prefix removed from the front of the string, if it is present there.
zsh, like ksh93 and bash, also supports ANSI C-like string syntax: $'\ufeff' refers to the Unicode character U+FEFF, which is the character a BOM consists of.
Combining these, ${filename#$'\ufeff'} expands to the content of $filename with the BOM removed if it is present at the front.
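As a minimal sketch of the prefix-removal step on its own, the snippet below builds the BOM from its three UTF-8 bytes (EF BB BF) with printf instead of $'\ufeff'. This is an assumption on my part rather than part of the fix above: it presumes the CSV is UTF-8 encoded, and it is meant for shells whose /bin/sh may not understand $'...'. The variable names bom, stripped, and unchanged are only illustrative.

```shell
# Build the UTF-8 BOM from its three bytes (octal 357 273 277 = hex EF BB BF).
bom=$(printf '\357\273\277')

# Simulate the first field read from a CSV that starts with a BOM.
filename="${bom}Image1.jpg"

# ${var#prefix} removes the prefix only when it is actually present.
stripped=${filename#"$bom"}
unchanged=${stripped#"$bom"}    # no BOM left, so nothing is removed

printf '%s\n' "$stripped" "$unchanged"
```

Both lines printed should read Image1.jpg, since the second expansion finds no BOM to remove.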
The script below also makes some changes for better performance, more reliable behavior with odd filenames, and compatibility with non-zsh shells:
#!/bin/zsh
red='\033[0;31m'
color_Off='\033[0m'
csvfile=$1
imgpath=$2
while IFS=, read -r filename rurl; do
    # Strip a leading BOM (U+FEFF) if present; only the first line can have one.
    filename=${filename#$'\ufeff'}
    if ! [ -f "${imgpath}/${filename}" ]; then
        printf '%s %bMISSING%b\n' "$filename" "$red" "$color_Off"
    fi
done <"$csvfile"
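To try the approach without touching real data, here is a hedged, self-contained sketch that wraps the same loop in a hypothetical function, list_missing, and runs it against a throwaway fixture. The function name, the temporary directory, and the use of mktemp -d (widely available, but not required by POSIX) are my assumptions. The BOM is again built from its UTF-8 bytes, so the sketch does not depend on $'...' support, and the color codes are left out to keep the output easy to compare.

```shell
# list_missing CSVFILE IMGDIR: print each filename from CSVFILE not found in IMGDIR.
list_missing() {
    bom=$(printf '\357\273\277')        # UTF-8 BOM bytes (EF BB BF)
    while IFS=, read -r filename rurl; do
        filename=${filename#"$bom"}     # drop the BOM if the field starts with it
        [ -f "$2/$filename" ] || printf '%s\n' "$filename"
    done <"$1"
}

# Throwaway fixture: two images exist, the CSV lists three and starts with a BOM.
tmp=$(mktemp -d)
touch "$tmp/Image1.jpg" "$tmp/Image2.jpg"
printf '\357\273\277Image1.jpg,detail-1\nImage2.jpg,detail-1\nImage3.jpg,detail-1\n' >"$tmp/list.csv"

missing=$(list_missing "$tmp/list.csv" "$tmp")
printf 'missing: %s\n' "$missing"

rm -rf "$tmp"
```

With this fixture only Image3.jpg should be reported; without the BOM-stripping line, Image1.jpg would be reported as well.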
Notes on changes unrelated to the specific fix:
Replacing echo -e with printf lets us pick which specific variables get escape sequences expanded: %s for filenames means backslashes and other escapes in them are left unmodified, whereas %b for $red and $color_Off ensures that the escape sequences in them are processed, so the highlighting still works.
Replacing cat $csvfile | with <"$csvfile" avoids the overhead of starting a separate cat process, and ensures that the while read loop runs in the same shell as the rest of the script rather than in a subshell. zsh runs the last stage of a pipeline in the current shell, so this is not an issue there, but it is a problem with bash when run without the non-default lastpipe option.
echo -n isn't reliable as a noop: some shells print -n as output, and the POSIX specification for echo permits this by leaving the behavior implementation-defined when the first argument is -n. If you need a noop, : or true is a better choice; but in this case we can just invert the test and move the else branch into the true branch.
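The first two notes can be checked with a short sketch. The filename containing a backslash and the variable names as_s, as_b, and count are made up for illustration; the here-document stands in for <"$csvfile".

```shell
name='odd\name.jpg'             # hypothetical filename with a literal backslash
as_s=$(printf '%s' "$name")     # %s copies the argument verbatim
as_b=$(printf '%b' "$name")     # %b expands \n into a real newline

# Redirecting input into the loop keeps it in the current shell,
# so the counter is still set after "done".
count=0
while IFS=, read -r filename rurl; do
    count=$((count + 1))
done <<EOF
Image1.jpg,detail-1
Image2.jpg,detail-1
Image3.jpg,detail-1
EOF

printf 'with %%s: %s\n' "$as_s"
printf 'rows read: %d\n' "$count"
```

Here %s leaves the name intact while %b would split it across two lines, and count ends up at 3 because the loop body ran in the current shell.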