
How to set a limit on user throughput for Isilon storage


Users can run many processes through HTCondor that all read from Isilon storage, and some users could hog the read throughput at the expense of others. Say the Isilon can sustain 10 GB/s of aggregate reads: if 3 users each run 100 processes reading at 1 GB/s, everyone else is severely starved. What kinds of solutions exist? Single-host limits do not work, because the users' reads come through HTCondor jobs spread across many hosts.


Solution

  • There are a couple of ways to do this in HTCondor, depending on the nature of your jobs and your system.

    First, you can use the concurrency limits feature to globally limit the number of running jobs across all users. Each job declares that it uses some share of the file server's bandwidth, and in the central manager configuration you set the total available, say

    ISILON_LIMIT = 1000

    Then each job declares how much of that bandwidth it uses by adding to its submit description

    concurrency_limits = isilon:100

    declaring that this job uses 100 of the 1000 available units of bandwidth, so the negotiator will keep at most ten such jobs running at once (a fuller submit-file sketch appears at the end of this answer).

    A second way can work when your files can be transferred once from the remote file server into a local scratch directory, operated on locally, and transferred back when the job is done. If this fits your usage model, look at HTCondor's file transfer mechanism and its custom file transfer plugins. HTCondor will then copy the files from the server into each job's local scratch directory, and the number of active transfers can be limited per schedd (a sketch of this setup is also given below).
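
    To make the first approach concrete, here is a minimal submit-file sketch, assuming the ISILON_LIMIT = 1000 setting above is already in the central manager's configuration. The executable name, input path, and the choice of 100 units per job are placeholders, not part of the original answer.

    # Hypothetical submit description (read_job.sub); names and paths are illustrative.
    executable          = analyze.sh
    arguments           = /mnt/isilon/data/input.dat
    # Claim 100 of the 1000 ISILON units, so the negotiator will keep at most
    # 10 such jobs running at once across the whole pool, regardless of user.
    concurrency_limits  = isilon:100
    request_cpus        = 1
    output              = read_job.out
    error               = read_job.err
    log                 = read_job.log
    queue

    Note that concurrency limits only count what jobs declare: a job that claims fewer units than it actually consumes can still oversubscribe the filer, so the per-job number has to be an honest estimate of its read rate.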
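
    For the second approach, here is a minimal sketch assuming HTCondor's built-in file transfer is used; the knob values, file names, and paths are placeholders. The schedd-side configuration caps how many transfers run at once, and the submit description asks HTCondor to stage the input into the job's scratch directory and bring outputs back on exit.

    # Schedd (submit host) configuration, e.g. a file under /etc/condor/config.d/:
    # cap how many file transfers this schedd runs at once
    # (inputs out to execute nodes, outputs back from them).
    MAX_CONCURRENT_UPLOADS   = 50
    MAX_CONCURRENT_DOWNLOADS = 50

    # Hypothetical submit description; the job reads only its local copy of the input.
    executable              = analyze.sh
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    transfer_input_files    = /mnt/isilon/data/input.dat
    queue

    If the data has to be fetched from a URL rather than a mounted path, a custom file transfer plugin (registered via FILETRANSFER_PLUGINS) can do the fetch instead; check the HTCondor documentation for your version, since the exact knobs and their defaults vary.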