shell ubuntu sh cat rm

Shell Script issue with comparing file name to string in a loop


Here is the issue. I have a directory with over 100,000K files on an Ubuntu 14.04 server. I need to process the files in the background, so I wrote a shell script that cats the files into a larger one and then removes each file. The problem is that it also cats the process script and the output file. Any ideas?

#!/bin/sh
c=0
#loop through 1000 results at 1 time
d=1000 

while [ $c -lt $d ]
do
  filename=$(`ls | head -n 1`)
  #echo $filename

  if [ $filename == "process.sh" ]
  then
    break
  fi

  if [ $filename ==  "file.txt" ]
  then
    break
  fi

  cat `ls | head -n 1` >> file.txt
  rm `ls | head -n 1`
  #echo $c
  c=`expr $c + 1`

done

Solution

  • Call ls | head -n 1 only once per loop iteration. After the checks you call ls | head -n 1 again, and the result can differ (another process.sh still running concurrently, or new files arriving).
    How do you intend to reach the files listed after file.txt? You break out of the loop, so every later file is skipped. Do not change the break to continue, because then you will keep assigning file.txt to filename.
    Always put double quotes around your variables (think of a file named my file.txt), and you might want to get used to braces as well.

    Suppose your batch works fine and it has processed the last non-special file. "${filename}" will then be empty! So start by testing if [ -f "${filename}" ]; that solves the problem with directories as well.

    I really hope you have permission to remove those files, so you won't get stuck processing the same file 1000 times.
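
    The quoting and the -f test described above can be sketched as a small helper. This is only an illustration: the function name append_and_remove and the scratch directory are assumptions, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: append one file to an output file, then delete it.
# Returns non-zero when the name is empty or is not a regular file.
append_and_remove() {
    src=$1
    out=$2
    # An empty name or a directory fails this test, so nothing is touched.
    [ -f "${src}" ] || return 1
    # Only remove the source when the append succeeded.
    cat -- "${src}" >> "${out}" && rm -f -- "${src}"
}

# Demo in a scratch directory, using a name that contains a space.
workdir=$(mktemp -d)
printf 'hello\n' > "${workdir}/my file.txt"
append_and_remove "${workdir}/my file.txt" "${workdir}/out.txt"
append_and_remove "" "${workdir}/out.txt" || echo "nothing left to process"
```

    Because every expansion is quoted, my file.txt is passed as one argument instead of two.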

    You shouldn't process ls output, so the alternative

    ls | egrep -v "file.txt|process.sh" | head -n 1
    

    is only a different way of doing it wrong.
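
    One way to avoid parsing ls output altogether is to let the shell expand a glob. The following is a minimal sketch under assumptions: the function name concat_batch and its three parameters (directory, output file, batch limit) are made up for illustration, and the special names are the two from the question.

```shell
#!/bin/sh
# Sketch: append up to "limit" regular files from "dir" to "out", removing each one.
concat_batch() {
    dir=$1
    out=$2
    limit=$3
    count=0
    for f in "${dir}"/*; do
        [ "${count}" -lt "${limit}" ] || break
        # Skips directories, and the unexpanded pattern when dir is empty.
        [ -f "${f}" ] || continue
        name=${f##*/}
        # Never feed the script or the output file back into the output.
        if [ "${name}" = "process.sh" ] || [ "${f}" = "${out}" ]; then
            continue
        fi
        cat -- "${f}" >> "${out}" && rm -f -- "${f}"
        count=$((count + 1))
    done
}

# Demo in a scratch directory.
workdir=$(mktemp -d)
printf 'a\n' > "${workdir}/a.txt"
printf 'b\n' > "${workdir}/b c.txt"
printf '#!/bin/sh\n' > "${workdir}/process.sh"
concat_batch "${workdir}" "${workdir}/file.txt" 1000
```

    Here continue is safe, unlike in the original loop, because the for loop moves on to the next name instead of asking ls for the first entry again.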

    When you get a "${filename}" and want to check this against a number of strings, you might want to use case "${filename}" in ... esac.
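
    A case check of that kind might look like the sketch below. The function name is_special and the sample names in the loop are assumptions for illustration only.

```shell
#!/bin/sh
# Returns 0 when the name is one of the files that must not be processed.
is_special() {
    case "$1" in
        process.sh|file.txt) return 0 ;;
        *) return 1 ;;
    esac
}

# Demo with a few made-up names.
for name in process.sh file.txt data001.log "my file.txt"; do
    if is_special "${name}"; then
        echo "skip: ${name}"
    else
        echo "process: ${name}"
    fi
done
```

    Adding another protected name is then a matter of extending the pattern list with one more | alternative.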

    When your file names don't contain newlines, you can use find with xargs.

    # First test it
    find . -type f \( ! -name process.sh -a ! -name file.txt \) 2>/dev/null |
       head -n 10 | xargs -I % sh -c 'echo "File %"; echo "rm -f %"'
    # Looking nice? Append with >> so each file adds to file.txt instead of overwriting it
    find . -type f \( ! -name process.sh -a ! -name file.txt \) 2>/dev/null |
       head -n 1000 | xargs -I % sh -c 'cat "%" >> file.txt; rm -f "%"'
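
    If file names may contain newlines, one possible variant is to let find run the commands itself with -exec, so no names pass through a pipe. This is a sketch under assumptions: the function name concat_all is hypothetical, -maxdepth is a GNU/BSD find option rather than POSIX, and unlike the pipeline above it processes every matching file instead of a batch of 1000.

```shell
#!/bin/sh
# Sketch: append every regular file directly inside "dir" to "out", then remove it.
concat_all() {
    dir=$1
    out=$2
    # "$1" of the inner sh is the output file; the remaining arguments are
    # the file names that find substitutes for {}.
    find "${dir}" -maxdepth 1 -type f ! -name process.sh ! -name "${out##*/}" \
        -exec sh -c 'out=$1; shift; for f in "$@"; do cat "$f" >> "$out" && rm -f "$f"; done' \
        sh "${out}" {} +
}

# Demo in a scratch directory, including a name with an embedded newline.
workdir=$(mktemp -d)
odd_name=$(printf 'odd\nname.txt')
printf 'a\n' > "${workdir}/a.txt"
printf 'b\n' > "${workdir}/b c.txt"
printf 'c\n' > "${workdir}/${odd_name}"
printf '#!/bin/sh\n' > "${workdir}/process.sh"
concat_all "${workdir}" "${workdir}/file.txt"
```

    The order in which find visits the files is not sorted, so the order of the contents in the output file should not be relied on.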