ssh rsync xargs gnu-parallel

Remote rsync in parallel


I'm trying to run rsync over ssh in parallel to transfer files between two machines for evaluation purposes. I want to see how much faster it can get compared to a single rsync process.

I tried two solutions. The first, https://wiki.ncsa.illinois.edu/display/~wglick/Parallel+Rsync, gave no great success. The second, https://gist.github.com/rcoup/5358786, I couldn't make work.

Based on the first link, I ran a command like this:

ssh HOST "mkdir -p ~/destdir/basefolder"
cd ./basefolder; ls | xargs -n1 -P 4 -I% rsync -arvuz -e ssh % HOST:~/destdir/basefolder/.

The files get transferred, but it doesn't seem to work well. It runs one process for every file and folder directly inside basefolder, but when an entry is a folder, everything inside that folder is transferred by a single process.

I tried to use find -type f, but then I lose the file hierarchy.

Does anyone know a way to do what I want (run rsync in parallel over ssh while keeping the file and folder hierarchy)?


Solution

  • Since you tagged your question 'gnu-parallel', the obvious answer is to refer you to http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync

    cd src-dir; find . -type f -size +100000 | parallel -v ssh fooserver mkdir -p /dest-dir/{//}\;rsync -Havessh {} fooserver:/dest-dir/{}
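
For the find -type f approach mentioned in the question, one possible way to keep the hierarchy with plain xargs is rsync's --relative (-R) option, which recreates each file's relative path under the destination. The following is only a sketch under assumptions: the function name parallel_rsync, its three arguments, and the RSYNC override variable are made up for illustration, and it assumes an xargs that supports -0 and -P (the GNU and BSD versions do). It starts one rsync (and one ssh connection) per file, so it will probably not pay off for many small files.

```shell
# Hypothetical helper: parallel_rsync SRC_DIR DEST [JOBS]
# Runs one rsync per regular file, JOBS at a time, keeping the hierarchy.
# The body runs in a subshell, so the caller's working directory is unchanged.
parallel_rsync() (
    src=$1
    dest=$2
    jobs=${3:-4}                # default of 4 parallel jobs is an arbitrary choice
    rsync_cmd=${RSYNC:-rsync}   # RSYNC is an illustrative override, not an rsync feature

    cd "$src" || exit 1

    # -print0 / -0 keep file names containing spaces or newlines intact.
    # --relative makes rsync recreate ./sub/dir/file as sub/dir/file under $dest.
    find . -type f -print0 |
        xargs -0 -P "$jobs" -I{} "$rsync_cmd" -a --relative {} "$dest"
)
```

It could be called as parallel_rsync ./basefolder HOST:~/destdir/basefolder/ 4, after creating the destination directory with ssh HOST "mkdir -p ~/destdir/basefolder" as in the question. Options such as -z or -e ssh from the original command can be added next to -a if needed.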