amazon-web-services hpc slurm

What mechanism does Slurm use to sync files between compute nodes and the master node? Is it encrypted?


I've set up a high-performance computing (HPC) cluster on AWS similar to the one described in this blog post. The resulting cluster has one master node that spins up one compute node.

Consider the following file (saved as test_slurm.sh):

#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --output=res.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00

ip a > file.txt

When I run sbatch test_slurm.sh from the master node, a new file.txt appears in the same directory, containing IP information that matches the compute node. If I ssh into the compute node, the file is available there as well.

It seems to me that the compute node executes the content of test_slurm.sh, saves a file in its file system and somehow syncs that with the master node. What mechanism is responsible for the file sync? Are the files synced in this manner encrypted in transit?


Solution

  • I asked a similar question on Amazon forums: https://forums.aws.amazon.com/message.jspa?messageID=968147

    As damienfrancois pointed out, "Slurm will not make any effort to transfer files to/from compute nodes, except for the submission script." The files show up on both nodes because AWS ParallelCluster sets up file sharing by default, using NFS as the sharing mechanism.

    NFS is set up without additional configuration, which means that encryption in transit is not currently supported.
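
One way to confirm this on your own cluster is to check which filesystem type backs the directory the job writes to. The following is a minimal sketch, assuming GNU coreutils `df` (for the `--output` option) and `awk` are available on the nodes; the helper names `is_nfs` and `fs_type` are made up for illustration and are not part of Slurm or ParallelCluster.

```shell
#!/bin/bash
# Hypothetical helpers, not part of Slurm or AWS ParallelCluster.

# Succeed if the given filesystem type string denotes an NFS mount.
is_nfs() {
    case "$1" in
        nfs|nfs4) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print the filesystem type backing a path.
# Assumes GNU df; line 1 of the output is the "Type" header, line 2 the value.
fs_type() {
    df --output=fstype "$1" | awk 'NR==2 {print $1}'
}

# Check the directory given as the first argument (default: current directory).
dir="${1:-.}"
type="$(fs_type "$dir")"

if is_nfs "$type"; then
    echo "$dir is on NFS ($type): both nodes see the same shared files"
else
    echo "$dir is on a non-NFS filesystem ($type)"
fi
```

Run from the directory containing test_slurm.sh on either node, this should report an NFS type if the directory is one of the shared mounts. A local type such as ext4 or xfs would mean the directory is not shared through NFS.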