I have a file share with (you guessed it) a lot of files. I want to create a batch job which mounts this file share and the reads in each of the files and processes each one in parallel (each as a batch task).
Is this possible to do with python and in azure batch? Any tutorial showing how to do this would be great.
You can do this in one of two ways. Note that the following only applies to Linux; Windows users will need to follow a slightly different method using User Identities.

1. Mount the file share in the pool's StartTask with `mount -t cifs ...`. This will work through reboots as the StartTask is re-run every time on reboot. Alternatively, have the StartTask add an entry to `/etc/fstab` to automount the share. Note that you must make this operation idempotent as the StartTask is re-run every time on reboot.
2. Mount the file share per job (e.g., in a job preparation task) and `unmount` the share for cleanup when the job completes.

Make sure, in any path you choose, that proper elevation privileges are given to the task (typically superuser) such that the process can perform the mount or modify `/etc/fstab`.
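As a minimal sketch of the first option, the following builds an idempotent command line you could hand to a StartTask. The storage account, share name, and mount point are hypothetical placeholders; the `mountpoint -q || mount` guard keeps the command safe to re-run on reboot:

```python
# Sketch: build an idempotent CIFS mount command for a Batch StartTask.
# STORAGE_ACCOUNT, SHARE_NAME and MOUNT_POINT are hypothetical placeholders.
STORAGE_ACCOUNT = "mystorageaccount"
SHARE_NAME = "myshare"
MOUNT_POINT = "/mnt/myshare"

def start_task_command(account_key: str) -> str:
    """Return a /bin/sh command line that mounts the Azure File share.

    `mkdir -p` plus the `mountpoint -q || mount` guard make the command
    idempotent, so it is safe when the StartTask re-runs on reboot.
    """
    unc = f"//{STORAGE_ACCOUNT}.file.core.windows.net/{SHARE_NAME}"
    mount = (
        f"mount -t cifs {unc} {MOUNT_POINT} "
        f"-o vers=3.0,username={STORAGE_ACCOUNT},password={account_key},serverino"
    )
    return (
        f'/bin/sh -c "mkdir -p {MOUNT_POINT} && '
        f'(mountpoint -q {MOUNT_POINT} || {mount})"'
    )
```

Per the elevation note above, the StartTask running this would need to execute as an elevated (superuser) identity for the mount to succeed.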
If you go with the first option, the mount will be available to the compute node at all times, regardless of whether a job that requires it is running on that node. There are advantages and disadvantages to each approach; your requirements, be they compliance or technical (for example), should help you decide which to choose.