I am struggling to find a comprehensive report on a general HPC cluster's average hardware utilization. There are various datasets available from Google or Facebook about their clouds' hardware utilization, but is there a similar report or dataset from an HPC centre that I can cite or look into?
My focus is to see how dynamic and long-tailed jobs would suffer if they run through coarse-grained resource managers like SLURM or Torque. I am aware that both of these resource managers support fine-grained execution, but they do not provide as comprehensive an API as resource managers like Mesos or YARN.
Not many HPC centres publish detailed, public reports of their usage. The exception has generally been the UK national HPC facilities which provide a huge amount of data on their historical use.
The current service, ARCHER, publishes monthly and quarterly data (including usage) from 2014 to the present at:
http://www.archer.ac.uk/about-archer/reports/
The previous service, HECToR, has similar data available from 2007-2014 at:
http://www.hector.ac.uk/about-us/reports/
and the service before that, HPCx, has data from 2002-2010:
http://www.hpcx.ac.uk/projects/reports/
This should give you around 15 years' worth of data to examine!