I am trying to write some performance-critical C++ code. Therefore I am using AVX intrinsics and need to align my data to 32 bytes.
I am using a struct which looks similar to this (I commented out parts of it to track down the issue):
struct Summation {
    alignas(ALIGNMENT) float summation[HIDDEN_SIZE] {};

    Summation() {
        // std::memcpy(summation, inputBias, sizeof(float) * HIDDEN_SIZE);
    }

    Summation& operator=(const Summation& other) {
        // std::memcpy(summation, other.summation, sizeof(float) * HIDDEN_SIZE);
        return *this;
    }
};
struct Evaluator {
    Evaluator() {}

    // inputs and outputs
    bool inputMap[INPUT_SIZE] {};

    // bias
    alignas(32) float input_bias[HIDDEN_SIZE] {};
    alignas(32) float hidden_bias {};

    // weights
    alignas(32) float input_weights[INPUT_SIZE][HIDDEN_SIZE] {};
    alignas(32) float hidden_weights[HIDDEN_SIZE] {};

    alignas(32) float activation[HIDDEN_SIZE] {};
    std::vector<Summation> summations {};
};
This compiles without issues, and running the following works without any problems:
nn::Evaluator ev1{};
The issue arises when I try to create a second Evaluator:
nn::Evaluator ev1{};
nn::Evaluator ev2{};
-->
Process finished with exit code -1073741571 (0xC00000FD)
I tracked down the problem to the creation of
alignas(32) float input_weights[INPUT_SIZE][HIDDEN_SIZE] {};
Yet I do not know why this would cause problems when creating a second Evaluator object but work with only a single Evaluator. I would be grateful for any help.
I have already found the issue. It turns out it's not the alignment but the allocation on the stack (exit code 0xC00000FD is the Windows status code for a stack overflow). Since the 2D array is very large compared to the other members, each Evaluator takes up a lot of stack space, and creating a second one causes a stack overflow.
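For anyone running into the same thing, here is a minimal sketch of one possible way around it: putting the Evaluator objects on the heap instead of the stack. It assumes C++17 or later, where `new` (and therefore `std::make_unique`) honors `alignas(32)` for over-aligned types; with older standards the alignment of heap objects is not guaranteed. The values of `INPUT_SIZE` and `HIDDEN_SIZE` are made up for illustration, since the real ones are not shown above, and `makeEvaluator` and `isAligned32` are hypothetical helper names. The `Summation` vector is left out to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical sizes, chosen only for illustration (the real values are not given).
// With these, input_weights alone takes 768 * 256 * 4 = 786,432 bytes.
constexpr std::size_t INPUT_SIZE = 768;
constexpr std::size_t HIDDEN_SIZE = 256;

struct Evaluator {
    // inputs and outputs
    bool inputMap[INPUT_SIZE] {};

    // bias
    alignas(32) float input_bias[HIDDEN_SIZE] {};
    alignas(32) float hidden_bias {};

    // weights
    alignas(32) float input_weights[INPUT_SIZE][HIDDEN_SIZE] {};
    alignas(32) float hidden_weights[HIDDEN_SIZE] {};

    alignas(32) float activation[HIDDEN_SIZE] {};
};

// The struct inherits the strictest alignment of its members.
static_assert(alignof(Evaluator) == 32, "Evaluator should be 32-byte aligned");

// Creates a zero-initialized Evaluator on the heap, so the large
// arrays no longer consume stack space.
// Assumption: C++17 aligned allocation, so the 32-byte alignment is kept.
std::unique_ptr<Evaluator> makeEvaluator() {
    return std::make_unique<Evaluator>();
}

// Returns true if the pointer lies on a 32-byte boundary.
bool isAligned32(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 32 == 0;
}
```

Usage would then look like `auto ev1 = makeEvaluator(); auto ev2 = makeEvaluator();`, with members accessed through `ev1->input_weights`. With the made-up sizes above, a single object still fits into a 1 MiB stack (the MSVC linker default) but two do not, which would match the behavior described in the question. Raising the stack size through a linker option is another route, but it only moves the limit rather than removing it.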