performance, regression, prediction, forecasting

Predicting system performance - a method for extrapolating multivariate performance metrics into a predictive equation


I have a reporting application. Its performance depends on the hardware it is hosted on and the data it runs against. Under hardware, the main factors are:

  • CPU cores
  • Memory
  • Hard disk speed

... and under data, the main factors are:

  • Number of customers
  • The average amount of data each customer has generated

My plan is to run a series of tests that measure performance while I alter one factor at a time. For example, I will run the performance tests against 1 core, 2 cores and 4 cores, and then against 4 GB, 16 GB and 64 GB of RAM.

From these measurements I would like to produce a formula that can roughly predict how well a system will perform given certain hardware and data.

For example:

Performance Score = f(cpu) + g(mem) + h(disk) + j(cust) + k(data)

where f, g, h, j and k are functions of the parameter they are passed.

My question is:

Is there a formal method for taking performance metrics as an input and extrapolating that data to produce a formula that predicts performance?


Solution

  • Yes - I would use linear regression as a starting point.

    For an example, see "How can I predict memory usage and time based on historical values".

    I found "Data Analysis Using Regression and Multilevel/Hierarchical Models" to be a highly readable introduction to the subject (you probably won't need multilevel models, so you can skip the second part of the book).
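
To make the suggestion concrete, here is a minimal sketch of fitting the additive formula from the question with ordinary least squares in NumPy. Everything specific in it is an assumption for illustration: the column layout of `runs`, the choice of `log2` for cores and memory (a guess at diminishing returns), the helper names `design_matrix`, `fit` and `predict`, and above all the numbers. The scores are generated from a made-up formula only so that the fit can be checked, and real benchmark results would replace them.

```python
import numpy as np

def design_matrix(runs):
    """Turn raw test settings into regression inputs.

    Assumed column order (hypothetical): cores, memory in GB,
    disk speed in MB/s, number of customers, average data per customer in MB.
    """
    runs = np.asarray(runs, dtype=float)
    cores, mem, disk, cust, data = runs.T
    return np.column_stack([
        np.ones(len(runs)),  # intercept term
        np.log2(cores),      # assumption: each doubling of cores helps equally
        np.log2(mem),        # assumption: each doubling of memory helps equally
        disk,
        cust,
        data,
    ])

def fit(runs, scores):
    """Ordinary least squares: coefficients that minimise the squared error."""
    X = design_matrix(runs)
    coef, *_ = np.linalg.lstsq(X, np.asarray(scores, dtype=float), rcond=None)
    return coef

def predict(coef, runs):
    """Predicted score for each row of settings in `runs`."""
    return design_matrix(runs) @ coef

# Hypothetical test plan: mostly one factor changed at a time.
runs = np.array([
    [1,  4, 100, 1000, 10],
    [2,  4, 100, 1000, 10],
    [4,  4, 100, 1000, 10],
    [4, 16, 100, 1000, 10],
    [4, 64, 100, 1000, 10],
    [4, 64, 200, 1000, 10],
    [4, 64, 400, 1000, 10],
    [4, 64, 400, 2000, 10],
    [4, 64, 400, 4000, 10],
    [4, 64, 400, 4000, 20],
    [4, 64, 400, 4000, 40],
    [2, 16, 200, 2000, 20],
], dtype=float)

# Made-up coefficients used only to synthesise scores for this demo.
true_coef = np.array([50.0, 8.0, 3.0, 0.02, -0.005, -0.4])
scores = design_matrix(runs) @ true_coef

coef = fit(runs, scores)
print("fitted coefficients:", np.round(coef, 4))
print("predicted score:", predict(coef, [[8, 32, 300, 3000, 15]]))
```

Each fitted coefficient plays the role of one of f, g, h, j and k in the question: a weight applied to a chosen transformation of that factor. With real measurements the fit will not be exact, so it is worth comparing predictions with a few held-out test runs, and treating predictions far outside the tested range (for example 64 cores when only 1 to 4 were measured) with caution.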