I have a micro-controller that does not have an MMU, but we are using C and C++.
We are avoiding all dynamic memory usage (i.e. no new SomeClass()
or malloc()
) and most of the standard library.
Semi-Question 0:
From what I understand std::array
does not use any dynamic memory so its usage should be OK (It is on the stack only). Looking at std::array
source code, it looks fine since it creates a c-style array and then wraps functionality around that array.
The chip we are using has 1MB of flash memory for storing code.
Question 1:
I am worried that the use of templates in std::array
will cause the binary to be larger, which will then potentially cause the binary to exceed the 1MB code memory limit.
I think if you create an instance of a std::array< int, 5 >
, then all calls to functions on that std::array
will occupy a certain amount of code memory, lets say X bytes of memory.
If you create another instance of std::array< SomeObject, 5 >
, then call functions to that std::array
, will each of those functions now be duplicated in the binary, thus taking up more code memory? X bytes of memory + Y bytes of memory.
If so, do you think the amount of code generated given the limited code memory capacity will be a concern?
Question 2:
In the above example, if you created a second std::array< int, 10 >
instance, would the calls to functions also duplicate the function calls in the generated code? Even though both instances are of the same type, int
?
std::array
is considered a zero cost abstraction, which means it should be fairly optimizable by the compiler.
As of any zero cost abstraction, it may induce a small compile time penality, and if the opimizations required te be truely zero cost are not supported, then it may incur a small size or runtime penality.
However, note that compiler are free to add padding at the end of a struct. Since std::array
is a struct, you should check how your platform is handling std::array
, but I highly doubt it's the case for you.
Take this array and std::array
case:
#include <numeric>
#include <iterator>
template<std::size_t n>
int stuff(const int(&arr)[n]) {
return std::accumulate(std::begin(arr), std::end(arr), 0);
}
int main() {
int arr[] = {1, 2, 3, 4, 5, 6};
return stuff(arr);
}
#include <numeric>
#include <iterator>
#include <array>
template<std::size_t n>
int stuff(const std::array<int, n>& arr) {
return std::accumulate(std::begin(arr), std::end(arr), 0);
}
int main() {
std::array arr = {1, 2, 3, 4, 5, 6};
return stuff(arr);
}
Clang support this case very well. all cases with std::array
or raw arrays are handleld the same way:
-O2
/ -O3
both array and std::array
with clang:
main: # @main
mov eax, 21
ret
However, GCC seem to have a problem optimizing it, for bith the std::array
and the raw array case:
-O3
with GCC for array and std::array
:
main:
movdqa xmm0, XMMWORD PTR .LC0[rip]
movaps XMMWORD PTR [rsp-40], xmm0
mov edx, DWORD PTR [rsp-32]
mov eax, DWORD PTR [rsp-28]
lea eax, [rdx+14+rax]
ret
.LC0:
.long 1
.long 2
.long 3
.long 4
Then, it seem to optimize better with -O2
in the case of raw array and fail with std::array
:
-O2
GCC std::array
:
main:
movabs rax, 8589934593
lea rdx, [rsp-40]
mov ecx, 1
mov QWORD PTR [rsp-40], rax
movabs rax, 17179869187
mov QWORD PTR [rsp-32], rax
movabs rax, 25769803781
lea rsi, [rdx+24]
mov QWORD PTR [rsp-24], rax
xor eax, eax
jmp .L3
.L5:
mov ecx, DWORD PTR [rdx]
.L3:
add rdx, 4
add eax, ecx
cmp rdx, rsi
jne .L5
rep ret
-O2
GCC raw array:
main:
mov eax, 21
ret
It seem that the GCC bug failling to optimize -O3
but succeed with -O2
is fixed in the most recent build.
Here's a compiler explorer with all the O2
and the O3
With all these cases stated, you can see a common pattern: No information about the std::array
is outputted in the binary. There are no constructors, no operator[]
, not even iterators, nor algorithms. Everything is inlined. Compiler are good at inlining simple functions. std::array
member functions are usually very very simple.
If you create another instance of std::array< SomeObject, 5 >, then call functions to that std::array, will each of those functions now be duplicated in the binary, thus taking up more flash memory? X bytes of memory + Y bytes of memory.
Well, you changed the data type your array is containing. If you manually add overload of all your functions to handle this additional case, then yes, all those new functions may take up some space. If your function are small, there is a great chance for them to be inlined and take less space. As you can see with the example above, inlining and constant folding may greatly reduce your binary size.
In the above example, if you created a second std::array instance, would the calls to functions also duplicate the function calls in flash memory? Even though both instances are of the same type, int?
Again it depends. If you have many function templated in the size of the array, both std::array
and raw arrays may "create" different function. But again, if they are inlined, there is no duplicate to be worried about.
Both will a raw array and std::array
, you can pass a pointer to the start of the array and pass the size. If you find this more suitable for your case, then use that, but still raw array and std::array
can do that. For raw array, it implicitly decays to a pointer, and with std::array
, you must use arr.data()
to get the pointer.