Tags: file-synchronization, copy, rsync, scp

What is a good pattern to synchronize files between computers in parallel (in CentOS)?


Trying to find a good way to copy code from one "deployment" computer to several "target" computers, hopefully in parallel. The idea is that the deployment computer holds a copy of the files exactly as they should appear on the target servers. We would like the copying to happen in parallel, as it might involve several tens of target servers.

Our current scheme uses rsync to synchronize the directory containing the files, keeping the target servers up to date with the deployment server.
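
For reference, a minimal sketch of what this rsync scheme looks like when pushed to several targets concurrently with a plain shell loop (the hostnames and paths here are hypothetical):

    # Background one rsync per target; wait blocks until all of them finish.
    for host in target01 target02 target03; do
        rsync -az --delete /srv/deploy/code/ "$host":/srv/code/ &
    done
    wait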

So, the questions are:

  1. What is a good / better way to do this?
  2. What sort of tools are used to do this?
  3. Should this problem be faced from a different angle or perspective that I'm totally missing?

Thanks very much!


Solution

  • Another option is pdsh, a parallel, distributed shell. It's available from EPEL, and allows running remote commands (via ssh) on multiple nodes in parallel. For example:

    pdsh -w node10,node11,node12 command
    

    Runs "command" on all three nodes in parallel. It also has a handy hostname expression feature to do the same thing with a bit less typing:

    pdsh -w node[10-12] command
    

    It also includes the pdcp command, which copies files to multiple nodes in parallel. (The pdsh package needs to be installed on all nodes for pdcp to work.)

    pdcp -w node[10-12] /local/file /remote/dir/
    

    The local file is copied to /remote/dir/ on all three nodes.
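
    Putting the two together, a deploy step might copy the files and then run a follow-up command on every node. A minimal sketch, assuming a systemd-based CentOS; the node range, the paths, and the myapp service name are hypothetical:

    # Copy the directory recursively to all nodes, then restart the service.
    pdcp -r -w node[10-12] /local/dir /remote/
    pdsh -w node[10-12] 'systemctl restart myapp'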