Tags: linux, amazon-web-services, amazon-s3, inotify, inotifywait

Inotifywait not uploading entire file


I have a script to upload a file from a directory to an s3 bucket.

My script is this

aws s3 sync <directory_of_files_to_upload> s3://<bucket-name>/

When I run this script, the whole file is uploaded properly. I want to run it whenever a new file arrives in the directory, so I decided to use inotify.

My inotify script is this:

#!/bin/bash

inotifywait -m -r -e create "<directory_of_files_to_upload>" | while read NEWFILE
do
        aws s3 sync sunshine s3://turnaround-sunshine/
done

My problem is twofold:

1. When I run this script, it takes over the terminal, so I can't do anything else:

[ec2-user@ip-xxx-xx-xx-xx s3fs-fuse]$ ./Script.sh 
Setting up watches.  Beware: since -r was given, this may take a while!
Watches established.
2. It runs when I upload a file from my local machine, but it doesn't upload the entire file. The file on the EC2 instance is 2.7 MB, but only ~350 KB arrives in S3. The whole file is uploaded correctly when I run the aws command myself without inotify. The script also prints the following when I upload a file to the monitored directory:

    upload: sunshine/turnaroundtest.json to s3://turnaround-sunshine/turnaroundtest.json


Solution

    1. You can run the script in the background:

      ./Script.sh &
      

      Or you could just open a second terminal window to run it.

    2. Your script starts uploading the file as soon as it's created, which doesn't give the writer time to finish writing it. A create event alone gives no reliable indication that the file is complete. The best way to address this is to change the writing application: it should first write the file in another directory, then move it into this directory when it's done. As long as the two directories are on the same filesystem, the move is atomic, so the upload script will only ever see the completed file. Note that inotify reports a file moved into a watched directory as a moved_to event rather than create, so the watch needs -e moved_to for this to trigger the upload.

      If you can't use two directories for some reason, you could use filename patterns. The writer could write the file to <filename>.temp and rename it to <filename> when it finishes. Your script could then ignore .temp files:

      while read -r newfile
      do
          case "$newfile" in
          *.temp) ;;   # still being written; skip it
          *) aws s3 sync sunshine s3://turnaround-sunshine/ ;;
          esac
      done
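The rename approach above can be sketched end to end as follows. This is a minimal sketch, not the original poster's script: the function names (is_complete_name, write_then_rename, watch_and_sync) are hypothetical, and it assumes the writer can be changed to use a .temp suffix. It also swaps the create event for moved_to and close_write, since a rename is reported as moved_to and close_write fires when a writer closes the file (which usually, but not necessarily, means the file is complete).

```shell
#!/bin/bash
# Sketch only: function names are hypothetical, not part of inotify-tools or the AWS CLI.

# Succeed if the path looks like a finished file (no .temp suffix).
is_complete_name() {
    case "$1" in
        *.temp) return 1 ;;  # writer has not renamed it yet
        *)      return 0 ;;
    esac
}

# Writer side: save stdin to <target>.temp, then rename it to <target>.
# The rename stays inside one directory, so readers never see a partial <target>.
write_then_rename() {
    target=$1
    cat > "$target.temp" && mv "$target.temp" "$target"
}

# Watcher side: sync whenever a file is renamed into place or closed after writing.
watch_and_sync() {
    watch_dir=$1
    bucket_url=$2
    # --format '%w%f' prints only the full path of the affected file, one per line.
    inotifywait -m -r -q -e moved_to -e close_write --format '%w%f' "$watch_dir" |
    while IFS= read -r path; do
        if is_complete_name "$path"; then
            # --exclude keeps in-progress .temp files out of the upload.
            aws s3 sync "$watch_dir" "$bucket_url" --exclude '*.temp'
        fi
    done
}

# Start watching only when both arguments are supplied.
if [ "$#" -eq 2 ]; then
    watch_and_sync "$1" "$2"
fi
```

Saved under a name such as watch.sh (an assumed filename), it could be started in the background with ./watch.sh sunshine s3://turnaround-sunshine/ &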