Search code examples
data-structuresdynamic-arraysspace-complexity

Is there a mathematical way to calculate the optimal growth factor for a dynamic array?


For a class project we need to find the optimal growth factor of a priority queue based on cyclic dynamic array. I've ran some tests but they were inconclusive. We were told there is a mathematical way of showing it but I couldn't find it anywhere.

I've looked up a bit and found that Java ArrayList uses 3/2, theres also a c implementation for pyhton's list that uses a 9/8 factor and basically I think it differs between the different programming languages.

So is there any mathematical way of finding the 'optimal' growth factor for a dynamic array?


Solution

  • The short answer is “No”.

    For most applications, the optimal growth factor would be the one which grows the array exactly to its maximally required size in a single step. So it is easy to determine in hindsight, but usually hard to predict.

    So lacking information about that maximal size, the strategy which optimizes memory requirements is the one which allocates a single additional cell in each step. The one which optimizes the number of reallocations is the one which allocates as much memory as could possibly be needed in a single step. For most practical applications, neither extreme is reasonable. Usually a tradeoff between memory and reallocations is desired.

    You could phrase your preferences with regard to that tradeoff in mathematical terms. E.g. by formulating some cost as a function of the number or frequency of allocations, the amount of total allocated space and the percentage actually used. If in addition to this you had a mathematical model of the typical usage pattern, then you could start optimizing, i.e. vary growth factor to minimize cost.

    In practice, neither cost nor usage are that well know, as both depend on a lot of external factors. therefore it all boils down to gut feeling and experience. If you notice your app wasting memory, you might decrease the growth factor, and if you notice it spending too much time on reallocations, you might increase the factor. Both changes would be pretty ad-hoc, without too much mathematical modeling involved.