Tags: linux, bash, sed, xargs

Random behaviour when piping find output to xargs and then to sed


I'm using the GNU versions of these tools. I'm trying to unzip an archive and transform a file.

The file is "myfile.txt", and it appears in multiple folders in the archive, so I thought passing the full path to xargs would transform all of the files:

mkdir temp
unzip mypackage.zip -d temp

find temp -iname "myfile.txt" | xargs -I FILE sh -c "sed -e 's/replacethis/withthis/g' -e 's/replacethistoo/withthisaswell/g' FILE | tee FILE"
# List the files
find temp -iname "myfile.txt" | xargs -I FILE ls -l FILE
# Cat the files
find temp -iname "myfile.txt" | xargs -I FILE cat FILE
# Clean up 
rm -Rf temp

I have run this script multiple times and got different outcomes, which I don't understand.

Each time, a different "myfile.txt" is modified, and sometimes one of the "myfile.txt" files ends up with 0 bytes.

Why is this happening? It should be the same every time, shouldn't it? Is find only passing one, random, "myfile.txt" path to xargs each time I run this script?


Solution

  • Why is this happening? It should be the same every time, shouldn't it?

    This happens because of a race condition. The shell starts both sides of the "sed ... FILE | tee FILE" pipeline at the same time, so two operations run in parallel:

    • sed opening and reading the file
    • tee opening and truncating the file

    If tee wins, the file is already empty when sed reads it, so sed has nothing to output and the file stays at 0 bytes.

    If sed wins, it'll read (at least parts of) the file and you'll get some data.

    Since process scheduling is not predictable, you risk seeing different results each time.
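
One way to avoid the race is to let sed rewrite each file itself with -i instead of piping into tee. This is a minimal sketch that assumes GNU sed, as stated in the question. The temporary directory and the sample file contents are hypothetical stand-ins for the unzipped archive. The longer pattern is listed first because the placeholder patterns overlap: "replacethis" also matches inside "replacethistoo".

```shell
#!/bin/sh
# Hypothetical setup standing in for the unzipped archive:
# two copies of myfile.txt in different folders.
workdir=$(mktemp -d)
mkdir -p "$workdir/a" "$workdir/b"
printf 'replacethis and replacethistoo\n' > "$workdir/a/myfile.txt"
printf 'keep replacethis\n' > "$workdir/b/myfile.txt"

# GNU sed -i writes the result to a temporary file and renames it over the
# original when it is finished, so no process reads a half-truncated file.
# The longer pattern comes first so the shorter one cannot clobber it.
find "$workdir" -iname "myfile.txt" -exec sed -i \
    -e 's/replacethistoo/withthisaswell/g' \
    -e 's/replacethis/withthis/g' {} +

# Show the transformed files.
find "$workdir" -iname "myfile.txt" -exec cat {} +
```

Because sed is the only process touching each file, the result no longer depends on scheduling. Remove the scratch directory with rm -rf "$workdir" when you are done.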