How to iterate LIVE through a directory?


I am running a bash script that iterates over the files in a folder with

for file in ./folder/*.xml; do ...

However, some files may be changed or deleted while the script is running, which results in an error.

Basically, I want to run the same script multiple times at the same time to make the process faster. The script imports files and deletes them afterwards.

Is there a way to write a while loop that re-reads the directory on each iteration?


Solution

  • I suggest making a slight change to what you are currently doing. Instead of running the script multiple times on the same set of files, I'd make it work on one file at a time, in parallel. Here's a version where the script uses xargs -P to re-invoke itself once for each .xml file in folder:

    #!/bin/bash
    
    if [[ $# -eq 1 ]]; then
        echo "$$: working with: $1"
    
        # do the work with the file $1 here
    
        # remove the file or move it to a "done" folder:
        if [[ -d folder/done ]]; then
            mv -f "$1" folder/done
        fi
    else
        mkdir -p folder/done
        shopt -s nullglob
        files=( folder/*.xml )
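        # ${#files[@]} is the element count; ${#files} is only the length of the first name
        if (( ${#files[@]} > 0 )); then
            # NUL-delimit the names so that spaces in file names survive xargs
            printf '%s\0' "${files[@]}" | xargs -0 -n1 -P0 "$0"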
        else
            echo "no files to process"
        fi
    fi
    

    If new .xml files may be added to folder while the script is running, then run the script in a loop. Below is an alternative to xargs -P which spawns the processes manually and loops to take care of any files added while the processing is going on.

    #!/bin/bash
    
    work () {
        # $BASHPID is the worker's own PID; $$ would print the parent's PID here
        echo "$BASHPID: working with: $1"
    
        # do the work with the file $1 here
    
        # remove the file or move it to a "done" folder:
        if [[ -d folder/done ]]; then
            mv -f "$1" folder/done
        fi
    }
    
    mkdir -p folder/done
    shopt -s nullglob
    
    processed=1
    while (( processed > 0 )); do
        processed=0
        for file in folder/*.xml
        do
            (( ++processed ))
            # spawn a worker process:
            work "$file" &
        done
    
        wait # for all the started worker processes
    
        echo "processed $processed files"
    done
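
If you would still rather start several independent instances of the same script, as described in the question, one common approach is to let each instance claim a file with a rename before touching it. Here is a minimal sketch of that idea, not a tested drop-in: the claimed.$$ directory, the claim_and_process function and the import_file placeholder are names made up for illustration, and it assumes the claim directory lives on the same filesystem as folder so that mv is a plain rename.

```shell
#!/bin/bash
# Sketch: several instances of this script may run at the same time.
# Hypothetical names (not from the original script): import_file,
# claim_and_process and the per-instance "claimed.$$" directory.

import_file () {
    echo "$$: working with: $1"
    # do the work with the file $1 here
}

claim_and_process () {
    dir=$1
    claimed="$dir/claimed.$$"          # private work directory of this instance
    mkdir -p "$claimed" "$dir/done"

    claimed_count=1
    while [ "$claimed_count" -gt 0 ]; do
        claimed_count=0
        # the glob is expanded again on every pass, so new files are picked up
        for file in "$dir"/*.xml; do
            # A rename within one filesystem either succeeds or fails as a whole,
            # so only one instance can claim a given file. If another instance
            # was faster (or the glob matched nothing), mv fails and we skip.
            if mv -- "$file" "$claimed/" 2>/dev/null; then
                claimed_count=$((claimed_count + 1))
                name=${file##*/}
                import_file "$claimed/$name"
                mv -f -- "$claimed/$name" "$dir/done/"
            fi
        done
    done

    rmdir "$claimed"                   # only succeeds if nothing was left behind
}

claim_and_process "${1:-folder}"
```

Each instance stops after a pass in which it could not claim anything, so you can start as many copies as you like; a file that disappears between the glob expansion and the mv is simply skipped instead of causing an error.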