After starting to use gcc 11 on Ubuntu 22.04, I've noticed a ~90% degradation in my C application's performance, the way I measure it.
Narrowing it down, I saw that the degradation has been present since gcc 8.4.0-3ubuntu2.
Now I'm on Ubuntu 22.04, using gcc-7 and gcc-8 (and gcc, which is gcc 11).
Compiling the exact same code with gcc-7 gives good results, while compiling with gcc-8 (or gcc 11) results in a slower application.
I did not find anything in the gcc 8 changes that should matter.
I don't have a simple application that reproduces this. If I had one, it would mean I already knew the source of the issue.
Any suggestions?
Did something change between gcc 7.5 and gcc 8.4?
** Edit ** - after running gprof on old-fast (built with gcc-7) and new-slow (built with gcc-8), I think the most valuable thing I see is that the new-slow version has this entry, in second place of the Flat profile:
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 39.27      9.83     9.83   173488     0.00     0.00  main_function
 22.89     15.56     5.73 ...
...
OK then, this was the case:
I had a big array declared on the stack of main_function(), as the pseudo-code below shows. For some reason gcc-7 did not care about it, but since gcc-8 it became an issue.
sizeof(my_big_struct) -> 100
Pseudo-code:
void main_function() {
    my_big_struct bigstruct_arr[20000];
    ...
}
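To put a number on that declaration: 20000 elements of 100 bytes each come to 2,000,000 bytes (about 1.9 MiB) in a single stack frame. The following is a minimal sketch of that arithmetic, assuming a POSIX system. The layout of my_big_struct is hypothetical (only its size is given above), and the helper names big_array_bytes and stack_soft_limit are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical stand-in: the post only states sizeof(my_big_struct) == 100. */
typedef struct { char payload[100]; } my_big_struct;

enum { BIG_ARR_LEN = 20000 };

/* Bytes occupied by `my_big_struct bigstruct_arr[20000]` in one stack frame. */
static size_t big_array_bytes(void) {
    assert(sizeof(my_big_struct) == 100);
    return sizeof(my_big_struct) * (size_t)BIG_ARR_LEN;
}

/* Soft stack limit of the process in bytes; 0 if unknown or unlimited. */
static size_t stack_soft_limit(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (size_t)rl.rlim_cur;
}
```

Comparing big_array_bytes() against stack_soft_limit() shows how much of the stack budget one call consumes. Threads created with a custom stack size have their own limit, which this sketch does not query.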
gcc-7 ran without any problems. gcc-8 (and 11) ran as well, but really slowly. I'm not sure why. Too much time spent on allocation? Or on array access?
As you can see from perf, it says exactly that main_function() is the problematic one.
It is a bit misleading, because the address 0x5594faaa3090 takes all the blame.
I did not understand what this address meant until I realized that it is that array, bigstruct_arr.
Samples: 36K of event 'cycles', Event count (approx.): 13441606624
  Children      Self  Command  Shared Object  Symbol
-   90.47%    88.88%  trd_1    my_process     [.] main_function
   + 88.26% 0x5594faaa3090
   + 1.60% main_function
     0.61% 0
The solution was, of course, to define the array as a global or to allocate it with malloc.
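Here is a minimal sketch of both options, not the actual application code. The layout of my_big_struct is hypothetical, and init_big_array, free_big_array and the parameters of main_function are names introduced only for this illustration. calloc is used in place of malloc so the buffer starts zeroed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in: the post only states sizeof(my_big_struct) == 100. */
typedef struct { char payload[100]; } my_big_struct;

enum { BIG_ARR_LEN = 20000 };

/* Option 1: static storage, reserved once for the whole program run.
   Note: shared by all callers, so not safe for concurrent use as-is. */
static my_big_struct bigstruct_arr_static[BIG_ARR_LEN];

/* Option 2: heap storage, allocated once and reused by every call. */
static my_big_struct *bigstruct_arr_heap = NULL;

/* Allocates the heap buffer on first use; returns 0 on success, -1 on failure. */
int init_big_array(void) {
    if (bigstruct_arr_heap == NULL)
        bigstruct_arr_heap = calloc(BIG_ARR_LEN, sizeof *bigstruct_arr_heap);
    return bigstruct_arr_heap != NULL ? 0 : -1;
}

/* Releases the heap buffer and resets the pointer. */
void free_big_array(void) {
    free(bigstruct_arr_heap);
    bigstruct_arr_heap = NULL;
}

/* The hot function now receives the buffer instead of declaring ~2 MB locally. */
void main_function(my_big_struct *arr, size_t len) {
    assert(arr != NULL && len > 0);
    arr[0].payload[0] = 1;        /* placeholder for the real work */
    arr[len - 1].payload[0] = 2;
}
```

A caller would run init_big_array() once at startup and then pass bigstruct_arr_heap (or bigstruct_arr_static) with BIG_ARR_LEN to each main_function call. Unlike the stack array, either buffer keeps its contents between calls, so any code that relied on a fresh array each time needs an explicit reset.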