Tags: c++, c, performance, optimization, processing-efficiency

Cost of changing a value vs. accessing an array in C


This question was closed for being opinion-based, so this is an edit to clarify what I meant by it.

Is there any way to make an educated guess regarding whether changing the value of a double will take more or less time than retrieving a double from an array? I understand that which is faster may be situational; the question is whether there is any way to predict which method is faster in a given situation. Alternatively, is there any "good practice" one should adhere to so that the compiler can do as much optimisation as possible?

This question is based on the knowledge that the time required to access a given piece of data depends on whether it is located in L1, L2, L3 (...) or RAM. Due to the limited space in L1, L2, ..., I would expect it to be marginally faster to repeatedly modify a single variable than to modify many different variables once. However, I have no idea how big the difference is, or whether it is possible to predict or influence which data and instructions will end up in which cache level or in RAM.
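For what it is worth, a rough micro-benchmark along the lines below is one way to probe the single-variable-versus-array-sweep part of this on a particular machine. The buffer size, the clock()-based timing and the volatile qualifier are arbitrary illustrative choices of mine, and the result only says something about that machine, compiler and set of flags:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 22)   /* ~4 million doubles, well beyond typical L1/L2 capacity */

int main(void) {
    double *many = malloc(N * sizeof *many);
    if (!many) return 1;
    for (size_t i = 0; i < N; i++) many[i] = (double)i;

    volatile double one = 0.0;              /* volatile keeps the repeated stores */

    clock_t t0 = clock();
    for (size_t i = 0; i < N; i++)
        one = one + 1.5;                    /* modify one variable N times */
    clock_t t1 = clock();
    for (size_t i = 0; i < N; i++)
        many[i] = many[i] + 1.5;            /* modify N different variables once */
    clock_t t2 = clock();

    printf("single variable: %f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("array sweep:     %f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    printf("checksum: %f %f\n", one, many[N / 2]);   /* keep both results live */
    free(many);
    return 0;
}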

Below is the question as it was originally stated:

The time an operation takes is (to my best knowledge) related to which memory cache the information you are using is stored in. So I am wondering whether it may be more efficient to change the value of a double 2N times rather than store N doubles in an array and then iterate over the array. The thought is that the variable being changed frequently will be stored in a lower-level cache, so that it will be accessed marginally faster than the values stored in the array. The array is small enough that the entire array fits in RAM; the point is not to free up memory.

Example code of the two alternatives is shown below. Note that the computations here are simplified to better describe the essence of the question. In reality the arrays are two-dimensional, and the computations of tmp1 and tmp2 are slightly larger, but they still depend only on the index:

#define DIM 1000
double states[DIM];
double time_derivatives[DIM];
double ambient_state = 3.0;
    
// Initialize states
for (int pos = 0; pos < DIM; pos++) {
    states[pos] = pos;
}

// Alternative 1
double tmp1;
double tmp2;

// Ends
tmp1 = 1;
tmp2 = 2;
time_derivatives[0] = (ambient_state - states[0]) * tmp1 + (states[1] - states[0]) * tmp2;
tmp1 = DIM;
tmp2 = DIM + 1;
time_derivatives[DIM - 1] = (ambient_state - states[DIM - 1]) * tmp2 + (states[DIM - 2] - states[DIM - 1]) * tmp1;

// Bulk
for (int pos = 1; pos < DIM - 1; pos++) {
    tmp1 = pos + 1;
    tmp2 = pos + 2;
    time_derivatives[pos] = (states[pos - 1] - states[pos]) * tmp1 + (states[pos + 1] - states[pos]) * tmp2;
}
    
// Alternative 2
double flows[DIM + 1];
double tmp1; // Some intermediate, necessary calculation variable

// Flows at ends
tmp1 = 1;
flows[0] = (states[0] - ambient_state) * tmp1;
tmp1 = DIM;
flows[DIM] = (ambient_state - states[DIM - 1]) * tmp1;

// Flows in bulk
for (int pos = 1; pos < DIM; pos++) {
    tmp1 = pos + 1;
    flows[pos] = (states[pos] - states[pos - 1]) * tmp1;
}
// Compute time derivatives
for (int pos = 0; pos < DIM; pos++) {
    time_derivatives[pos] = flows[pos + 1] - flows[pos];
}
    

In alternative 1, a lot of calculations are "repeated" in the final for-loop, as (states[pos + 1] - states[pos]) * tmp2 in one iteration will equal -(states[pos - 1] - states[pos]) * tmp1 in the next iteration. In alternative 2, each difference is calculated once and stored in the array flows, thereby reducing the total number of computations.
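The reuse claim can be checked directly; the throwaway snippet below (not part of either alternative) verifies that the second term of time_derivatives[pos] is exactly the negative of the first term of time_derivatives[pos + 1], factor included:

#include <assert.h>
#include <stdio.h>

#define DIM 1000

int main(void) {
    double states[DIM];
    for (int pos = 0; pos < DIM; pos++) states[pos] = pos * 0.5;

    for (int pos = 1; pos < DIM - 2; pos++) {
        /* second term of time_derivatives[pos] in alternative 1 (tmp2 = pos + 2) */
        double term_here = (states[pos + 1] - states[pos]) * (pos + 2);
        /* first term of time_derivatives[pos + 1] in alternative 1 (tmp1 = pos + 2) */
        double term_next = (states[pos] - states[pos + 1]) * ((pos + 1) + 1);
        assert(term_here == -term_next);    /* sign flips are exact in IEEE 754 */
    }
    printf("every interior difference is computed twice in alternative 1\n");
    return 0;
}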

The question is essentially: what is the cost of a computational operation compared to the cost of storing a value in an array and accessing it again later? Are there limiting cases in which one approach will be more efficient than the other?
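One way to make the comparison concrete on a particular machine and compiler is to wrap the two alternatives in functions and time them over many repetitions. The harness below is only a rough sketch: the function names alt1 and alt2, the repeat count and the clock()-based timing are illustrative choices of mine, tmp1/tmp2 are folded into literals, and a careful measurement would also inspect the generated assembly and interleave the runs:

#include <stdio.h>
#include <time.h>

#define DIM 1000
#define REPS 100000

static double states[DIM];
static double time_derivatives[DIM];
static double flows[DIM + 1];
static const double ambient_state = 3.0;

/* Alternative 1: recompute each difference as it is needed */
static void alt1(void) {
    time_derivatives[0] = (ambient_state - states[0]) * 1.0
                        + (states[1] - states[0]) * 2.0;
    time_derivatives[DIM - 1] = (ambient_state - states[DIM - 1]) * (DIM + 1.0)
                              + (states[DIM - 2] - states[DIM - 1]) * (double)DIM;
    for (int pos = 1; pos < DIM - 1; pos++)
        time_derivatives[pos] = (states[pos - 1] - states[pos]) * (pos + 1.0)
                              + (states[pos + 1] - states[pos]) * (pos + 2.0);
}

/* Alternative 2: compute each difference once and store it in flows[] */
static void alt2(void) {
    flows[0] = (states[0] - ambient_state) * 1.0;
    flows[DIM] = (ambient_state - states[DIM - 1]) * (double)DIM;
    for (int pos = 1; pos < DIM; pos++)
        flows[pos] = (states[pos] - states[pos - 1]) * (pos + 1.0);
    for (int pos = 0; pos < DIM; pos++)
        time_derivatives[pos] = flows[pos + 1] - flows[pos];
}

int main(void) {
    for (int pos = 0; pos < DIM; pos++) states[pos] = pos;

    clock_t t0 = clock();
    for (int r = 0; r < REPS; r++) alt1();
    clock_t t1 = clock();
    for (int r = 0; r < REPS; r++) alt2();
    clock_t t2 = clock();

    printf("alternative 1: %f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("alternative 2: %f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    printf("check value:   %f\n", time_derivatives[DIM / 2]);  /* keep results live */
    return 0;
}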


Solution

  • It's true you can't know without measuring, but measuring carries its own risks: you can measure wrong, or measure a machine that isn't the one your code will eventually run on.

    Remember too that you could easily be measuring the wrong thing. Programmer time is usually a lot more expensive than machine time. Guessing — even guessing wrong — could be the best strategy, because it's quick.

    So here's a basis for a quick guess.

    About 20 years ago I worked on a Monte-Carlo simulation system, something that requires lots of random numbers. We spent weeks evaluating random number generators to pick one that introduced the least bias into our model. Then we stored those numbers in an array, and used that array throughout our process.

    About 10 years later we had cause to revisit that process, IIRC because we needed more numbers. Along the way, we noticed that the array wasn't helping: it was quicker to call the RNG function each time we needed a number than to use the pre-generated array. By a lot.

    Random-number generation is a surprisingly complicated business with quite a bit of computation attached. But it's a small algorithm, hardly a page of code.

    The lesson I took away is that computation is cheap and cache memory is not. I use that as a basis for my guesses all the time. Feel free to do the same.
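
    If you want to reproduce the shape of that comparison on your own machine, something like the sketch below will do. The xorshift64* generator, the draw count and the clock()-based timing are stand-ins of mine (the system from the anecdote and its RNG aren't specified here), and a generator this cheap combined with a purely sequential array sweep may well flip the outcome; the point is how to test the guess, not what the answer will be.

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <time.h>

    #define N (1 << 24)   /* 16M draws; the pre-generated array is far larger than any cache */

    static uint64_t rng_state = 88172645463325252ULL;

    /* xorshift64* -- a deliberately small, cheap generator, used only for illustration */
    static double next_uniform(void) {
        rng_state ^= rng_state >> 12;
        rng_state ^= rng_state << 25;
        rng_state ^= rng_state >> 27;
        return (double)((rng_state * 2685821657736338717ULL) >> 11)
             / (double)(1ULL << 53);
    }

    int main(void) {
        double *pre = malloc((size_t)N * sizeof *pre);
        if (!pre) return 1;
        for (int i = 0; i < N; i++) pre[i] = next_uniform();

        double sum_fresh = 0.0, sum_array = 0.0;

        clock_t t0 = clock();
        for (int i = 0; i < N; i++) sum_fresh += next_uniform();  /* compute on demand */
        clock_t t1 = clock();
        for (int i = 0; i < N; i++) sum_array += pre[i];          /* read pre-generated */
        clock_t t2 = clock();

        printf("generate on the fly: %f s (sum %f)\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC, sum_fresh);
        printf("read from array:     %f s (sum %f)\n",
               (double)(t2 - t1) / CLOCKS_PER_SEC, sum_array);
        free(pre);
        return 0;
    }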