cloud mesos hpc slurm

HPC job studies and hardware utilization report


I am struggling to find a comprehensive report on the average hardware utilization of a general HPC cluster. Google and Facebook have published data sets on hardware utilization in their clouds, but is there a similar report or data set from an HPC center that I can cite or look into?

My focus is to see how dynamic and long-tailed jobs would suffer when run through coarse-grained resource managers like SLURM or Torque. I am aware that both of these resource managers support fine-grained execution, but they do not provide as comprehensive an API as resource managers like Mesos or YARN.


Solution

  • Not many HPC centres publish detailed public reports of their usage. The main exception has been the UK national HPC facilities, which provide a huge amount of data on their historical use.

    The current service, ARCHER, publishes monthly and quarterly data (including usage) from 2014 to the present at:

    http://www.archer.ac.uk/about-archer/reports/

    The previous service, HECToR, has similar data available from 2007-2014 at:

    http://www.hector.ac.uk/about-us/reports/

    and the service before that, HPCx, has data from 2002-2010:

    http://www.hpcx.ac.uk/projects/reports/

    This should give you around 15 years' worth of data to examine!
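
If you end up with per-job records rather than pre-computed figures, average utilization over a reporting window can be estimated as consumed core-hours divided by available core-hours. The following is only a minimal sketch of that arithmetic: the `Job` record layout, the `utilization` function, and the sample numbers are all hypothetical and do not reflect the format of the ARCHER, HECToR, or HPCx reports.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """One finished job (hypothetical layout, not a real report format)."""
    start_hour: float  # hours since the start of the reporting window
    end_hour: float    # hours since the start of the reporting window
    cores: int         # cores allocated for the whole run


def utilization(jobs, total_cores, window_hours):
    """Return the fraction of available core-hours consumed by the jobs."""
    used = 0.0
    for job in jobs:
        # Clip each job to the window so time outside it is not counted.
        start = max(job.start_hour, 0.0)
        end = min(job.end_hour, window_hours)
        if end > start:
            used += (end - start) * job.cores
    return used / (total_cores * window_hours)


# Made-up sample: three jobs on an assumed 128-core machine over 24 hours.
jobs = [Job(0, 12, 64), Job(6, 30, 32), Job(20, 24, 128)]
print(f"{utilization(jobs, total_cores=128, window_hours=24):.3f}")
```

Note that this counts allocated cores, not cores that were actually busy, so it says nothing about how well a long-tailed job used its allocation once it was running.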