I rsync static HTML files to our webserver with this line:
rsync -rlpcgoDvz --delete _site/* [email protected]:/var/www/x/public/
To warm up the webserver's cache I'd like to fetch the synced files right after rsyncing them, with wget (e.g. wget http://www.xx.de/bla/foo.html) or curl. Is there a way to tell rsync or the bash shell to do that?
You can do this in several ways, I suppose. Here is one I came up with:
Step 1:
Add the --log-file option to rsync so that you get a log of the actions it took. For instance:
rsync -rlpcgoDvz --log-file=log --delete _site/* [email protected]:/var/www/x/public/
The log will look something like this (here, a log for transferring 4 files named "file1", "file2", "file3", "file4"):
2015/02/13 12:52:11 [54686] receiving file list
2015/02/13 12:52:11 [54686] >f+++++++ file1
2015/02/13 12:52:11 [54686] >f+++++++ file2
2015/02/13 12:52:11 [54686] >f+++++++ file3
2015/02/13 12:52:11 [54686] >f+++++++ file4
We're interested in the >f+++++++ field and the field after it, which is the file name. (Briefly: >f means a regular file was transferred, and the run of + characters means it was newly created on the receiving end; see the description of --itemize-changes in the rsync man page for the full breakdown.)
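Before fetching anything, you can sanity-check the extraction by printing the URLs the next step would request (a quick sketch; it assumes the log file is named log and that paths under _site/ map directly onto http://www.xx.de/):
grep '>f++++++' log | cut -d ' ' -f 5 | sed 's#^#http://www.xx.de/#'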
Step 2:
After the transfer is complete, pick out the file names and call wget on each:
cat log | grep '>f++++++' | cut -d ' ' -f 5 | while read -r filename; do wget "http://www.xx.de/$filename"; done
Breaking it piece by piece:
cat log |            # Pipe the log file.
grep '>f++++++' |    # Take only the interesting lines.
                     # Here: only files which were not present
                     # on the other end.
cut -d ' ' -f 5 |    # Take the file name (field 5).
while read -r filename; do wget "http://www.xx.de/$filename"; done
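One caveat: cut -d ' ' -f 5 cuts off everything after the first space, so file names containing spaces get truncated. If that can happen in your tree, a variant that strips the four leading log fields with sed is more tolerant (a sketch, same assumed log layout as above):
sed -n 's/^[^ ]* [^ ]* [^ ]* >f[^ ]* //p' log | while IFS= read -r filename; do wget "http://www.xx.de/$filename"; done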
You might need to adjust some file paths, etc. to fit your use case.
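Alternatively, you can skip the intermediate log file entirely: rsync can print each transferred name straight to stdout via its --out-format option (%o is the operation, e.g. "send", and %n is the file name). A sketch, assuming the remote document root maps 1:1 onto http://www.xx.de/ (note that -v is dropped so the only stdout lines are the formatted ones):
rsync -rlpcgoDz --delete --out-format='%o %n' _site/* [email protected]:/var/www/x/public/ |
sed -n 's/^send //p' |   # keep only items sent to the server
grep -v '/$' |           # drop directories (%n gives them a trailing slash)
while IFS= read -r filename; do wget -q "http://www.xx.de/$filename"; done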