I'm having a problem running remote GNU Parallel jobs: a parallel command that runs jobs on a cluster of nodes, where the varying argument is the list of files in a directory produced by a glob pattern. The command looks like:
bcpexport() {
<do some stuff with the given file arg $1 to BCP-copy the file contents to some MSSQL Server>
}
export -f bcpexport
parallel -q -j 10 --sshloginfile /path/to/list/of/nodes.txt --env bcpexport \
bcpexport {} "$TO_SERVER_ODBCDSN" $DB $TABLE $USER $PASSWORD $RECOMMEDED_IMPORT_MODE $DELIMITER \
::: "$DATAFILES/$TARGET_GLOB"
When run on a single node, things work fine. The "$DATAFILES/$TARGET_GLOB"
glob pattern has the form /path/to/a/set/of/files/*.tsv,
which exists as an NFS link to a shared file system that I can confirm is accessible from all of the other nodes. However, when using the --sshloginfile
option to execute remotely on other nodes, I see the error
/bin/bash: line 27: /path/to/a/set/of/files/*.tsv: No such file or directory
as if the function is receiving the glob pattern itself as a filename, rather than a filename from the list of files the glob expands to (which appears to be the behavior when running in single-node mode).
If anyone knows what's going on here, advice and suggestions would be appreciated.
Found that the problem was the combination of the -q
option and the quoted glob variable. The -q option was being used to pass the "$TO_SERVER_ODBCDSN"
arg into the parallel job without word-splitting that string variable, which has spaces in it; but -q also quotes the whole command, so the remote shell never expands glob characters either. And since "$DATAFILES/$TARGET_GLOB" was itself quoted, the local shell did not expand the glob, so GNU Parallel received the literal pattern as its single input and handed each node the pattern itself as a filename. Unquoting the glob pattern variable to just $DATAFILES/$TARGET_GLOB
lets the calling shell expand the glob into the actual list of files before parallel sees the inputs, which solved the problem.
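The quoting behavior can be reproduced in isolation with plain bash. This is a minimal sketch, assuming a hypothetical directory /tmp/demo containing a.tsv and b.tsv:

FILES='/tmp/demo/*.tsv'
printf '%s\n' "$FILES"    # quoted: prints the literal pattern /tmp/demo/*.tsv
printf '%s\n' $FILES      # unquoted: the shell expands the glob, printing a.tsv and b.tsv

(Unquoted expansion is also subject to word splitting, which is harmless here since the filenames contain no spaces.)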
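For reference, the working invocation changes only the quoting of the final argument relative to the command in the question:

parallel -q -j 10 --sshloginfile /path/to/list/of/nodes.txt --env bcpexport \
bcpexport {} "$TO_SERVER_ODBCDSN" $DB $TABLE $USER $PASSWORD $RECOMMEDED_IMPORT_MODE $DELIMITER \
::: $DATAFILES/$TARGET_GLOB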