c++ performance operator-overloading numerical-computing

Efficiency of manually written loops vs operator overloads


In the program I'm working on, I have 3-element arrays that I use as mathematical vectors for all intents and purposes.

While writing my code, I was tempted to just roll my own Vector class with simple arithmetic overloads (+, -, *, /) so that I can simplify statements like:

// old:
for (int i = 0; i < 3; i++)
    r[i] = r1[i] - r2[i];

// new:
r = r1 - r2;

I'd expect these two to generate more or less identical code. But when it comes to more complicated expressions, could this seriously hurt my performance? One example from my code is this:

Manually written version:

for (int j = 0; j < 3; j++)
{
    p.vel[j] = p.oldVel[j] + (p.oldAcc[j] + p.acc[j]) * dt2 + (p.oldJerk[j] - p.jerk[j]) * dt12;
    p.pos[j] = p.oldPos[j] + (p.oldVel[j] + p.vel[j]) * dt2 + (p.oldAcc[j] - p.acc[j]) * dt12;
}

Using the Vector class with operator overloads:

p.vel = p.oldVel + (p.oldAcc + p.acc) * dt2 + (p.oldJerk - p.jerk) * dt12;
p.pos = p.oldPos + (p.oldVel + p.vel) * dt2 + (p.oldAcc - p.acc) * dt12;

I am trying to optimize my code for speed, since this sort of code runs inside inner loops. Will using the overloaded operators here affect performance? I'm doing numerical integration of a system of n mutually gravitating bodies. These vector operations are extremely common, so having them run fast is important.

Any insight would be appreciated, as would any idioms or tricks I'm unaware of.


Solution

  • If the operators are inlined and optimised well by your compiler, you shouldn't usually see any difference between writing the code well (using operators to make it readable and maintainable) and manually inlining everything.

    Manual inlining also considerably increases the risk of bugs: instead of re-using a single piece of well-tested code, you'll be writing the same code over and over. I would recommend writing the code with operators first. Then, if you can prove that manual inlining speeds it up, duplicate the code and manually inline the second version. You can then run the two variants against each other to confirm (a) that the manual inlining is effective, and (b) that the readable and manually inlined versions produce the same result.

    Before you start manually inlining, though, there's an easy way for you to answer your question for yourself: Write a few simple test cases both ways, then execute a few million iterations and see which approach executes faster. This will teach you a lot about what's going on and give you a definite answer for your particular implementation and compiler that you will never get from the theoretical answers you'll receive here.
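
As a rough sketch of such a test, here is one way the two variants could be set up side by side. Everything that isn't in the question is an assumption for illustration: the `Vec3` and `Particle` types, the helper names (`stepLoop`, `stepOps`, `maxAbsDiff`, `makeParticle`, `timeSteps`, `compareVariants`), the starting values, and the step factors `dt2 = dt/2` and `dt12 = dt*dt/12`.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdio>

// Hypothetical 3-component vector type (an assumption, not the asker's class).
struct Vec3 {
    double v[3];
    double  operator[](std::size_t i) const { return v[i]; }
    double& operator[](std::size_t i)       { return v[i]; }
};

// Small operators defined inline so the compiler has the chance to inline them.
inline Vec3 operator+(const Vec3& a, const Vec3& b) {
    return {{a[0] + b[0], a[1] + b[1], a[2] + b[2]}};
}
inline Vec3 operator-(const Vec3& a, const Vec3& b) {
    return {{a[0] - b[0], a[1] - b[1], a[2] - b[2]}};
}
inline Vec3 operator*(const Vec3& a, double s) {
    return {{a[0] * s, a[1] * s, a[2] * s}};
}

// Field names follow the question; the struct itself is assumed.
struct Particle {
    Vec3 pos, vel, acc, jerk, oldPos, oldVel, oldAcc, oldJerk;
};

// Variant 1: the manually written per-component loop from the question.
inline void stepLoop(Particle& p, double dt2, double dt12) {
    for (int j = 0; j < 3; j++) {
        p.vel[j] = p.oldVel[j] + (p.oldAcc[j] + p.acc[j]) * dt2 + (p.oldJerk[j] - p.jerk[j]) * dt12;
        p.pos[j] = p.oldPos[j] + (p.oldVel[j] + p.vel[j]) * dt2 + (p.oldAcc[j] - p.acc[j]) * dt12;
    }
}

// Variant 2: the same update written with the overloaded operators.
inline void stepOps(Particle& p, double dt2, double dt12) {
    p.vel = p.oldVel + (p.oldAcc + p.acc) * dt2 + (p.oldJerk - p.jerk) * dt12;
    p.pos = p.oldPos + (p.oldVel + p.vel) * dt2 + (p.oldAcc - p.acc) * dt12;
}

// Largest absolute per-component difference between two vectors.
inline double maxAbsDiff(const Vec3& a, const Vec3& b) {
    double d = 0.0;
    for (int i = 0; i < 3; i++) d = std::fmax(d, std::fabs(a[i] - b[i]));
    return d;
}

// Arbitrary starting state, chosen only for illustration.
inline Particle makeParticle() {
    Particle p{};  // pos and vel start at zero
    p.oldPos  = {{1.0, 2.0, 3.0}};
    p.oldVel  = {{0.5, -0.25, 0.125}};
    p.oldAcc  = {{1.0, 0.0, -1.0}};
    p.acc     = {{0.5, 0.5, 0.5}};
    p.oldJerk = {{2.0, 4.0, 8.0}};
    p.jerk    = {{1.0, 1.0, 1.0}};
    return p;
}

// Times `iterations` calls of `step` on `p` and returns the elapsed seconds.
template <typename Step>
double timeSteps(Step step, Particle& p, long iterations, double dt2, double dt12) {
    assert(iterations >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; i++) {
        step(p, dt2, dt12);
        // Feed the result back in so the work cannot be hoisted out of the loop.
        p.oldPos = p.pos;
        p.oldVel = p.vel;
    }
    const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Checks that one step of each variant agrees, then times both and prints the
// result. Returns true if the single-step results match within a tolerance.
inline bool compareVariants(long iterations) {
    const double dt = 0.01, dt2 = dt / 2, dt12 = dt * dt / 12;  // assumed factors
    Particle a = makeParticle(), b = a;
    stepLoop(a, dt2, dt12);
    stepOps(b, dt2, dt12);
    const bool same = maxAbsDiff(a.vel, b.vel) <= 1e-12 && maxAbsDiff(a.pos, b.pos) <= 1e-12;

    const double tLoop = timeSteps(
        [](Particle& p, double h2, double h12) { stepLoop(p, h2, h12); }, a, iterations, dt2, dt12);
    const double tOps = timeSteps(
        [](Particle& p, double h2, double h12) { stepOps(p, h2, h12); }, b, iterations, dt2, dt12);
    // Printing a result component keeps the timed work observable.
    std::printf("loop: %.3f s, operators: %.3f s (pos[0] = %g / %g)\n",
                tLoop, tOps, a.pos[0], b.pos[0]);
    return same;
}
```

Calling `compareVariants` from `main` with a few million iterations, in an optimised build (for example `-O2`), gives a first impression. A micro-benchmark like this is easily distorted by the optimiser and by measurement noise, so run it several times and, if the numbers differ, compare the generated assembly before drawing conclusions.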