When executing the following code I get a stack smashing error:
const uint size = 62;
...
for (int i=0; i < 10; ++i){
// mask = elements != zero
// input = __m512i of epi32 lanes, each holding a 1-byte value
__m512i compressed = _mm512_mask_compress_epi32(input, mask, input);
// get just elements != 0 as previous mask.
__mmask16 mask1 = _mm512_cmpneq_epi32_mask(compressed, _mm512_setzero_epi32());
// append the non-zero elements to the uchar*
_mm512_mask_cvtusepi32_storeu_epi8((uchar*)str+pos, mask1, compressed); // commenting this line out = no error, truncating mask = no error
// add size of the inserted elements by counting 1's in mask
pos += sizeOfInsertion;
// print the position of the pointer AFTER storing
void* pp = (void*) ((uchar*) str + pos);
std::cout << pp << std::endl;
}
To investigate this issue, I checked the position of the pointer while inserting the elements.
At the beginning (pointing to str[0]) I have 0x7ffce3468d30, at the end 0x7ffce3468d69. Subtracting these addresses I get 0x39 = 57, so it should fit inside the declared array.
Shifting the mask by 1 (truncating one element), it doesn't throw an error.
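The failure was in the compression. I had not taken care to zero the values not matching the mask: _mm512_mask_compress_epi32(input, mask, input) packs the selected elements into the low lanes but copies the remaining lanes from its first argument, here input itself. Non-zero leftovers in those upper lanes set extra bits in mask1, so the data wasn't stored contiguously, the masked store wrote beyond the bytes accounted for in pos, and the stack buffer overflowed.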
In short, replacing the compress call with
_mm512_maskz_compress_epi32(mask, input);
made it work.
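
To make the difference concrete, here is a scalar sketch in plain C++ (no AVX-512 hardware needed) that models the documented lane behaviour of the two compress intrinsics. It is only a model, not the real intrinsics: the names Lanes, model_mask_compress, model_maskz_compress, nonzero_mask and compare_compress_variants are made up for illustration, and the lane values are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 16;  // a __m512i holds 16 epi32 lanes
using Lanes = std::array<std::uint32_t, kLanes>;

// Scalar model of _mm512_mask_compress_epi32(src, k, a): lanes of a selected
// by k are packed into the low lanes, the remaining lanes are taken from src.
Lanes model_mask_compress(const Lanes& src, std::uint16_t k, const Lanes& a) {
    Lanes dst = src;  // pass-through for every lane above the packed ones
    int out = 0;
    for (int i = 0; i < kLanes; ++i)
        if (k & (1u << i)) dst[out++] = a[i];
    return dst;
}

// Scalar model of _mm512_maskz_compress_epi32(k, a): same packing, but the
// remaining lanes are zeroed.
Lanes model_maskz_compress(std::uint16_t k, const Lanes& a) {
    Lanes dst{};  // all lanes start at zero
    int out = 0;
    for (int i = 0; i < kLanes; ++i)
        if (k & (1u << i)) dst[out++] = a[i];
    return dst;
}

// Scalar model of _mm512_cmpneq_epi32_mask(v, _mm512_setzero_epi32()).
std::uint16_t nonzero_mask(const Lanes& v) {
    std::uint16_t m = 0;
    for (int i = 0; i < kLanes; ++i)
        if (v[i] != 0) m = static_cast<std::uint16_t>(m | (1u << i));
    return m;
}

// Hypothetical input with non-zero values in lanes 1, 3, 4 and 15.
void compare_compress_variants() {
    Lanes input{};
    input[1] = 7; input[3] = 9; input[4] = 5; input[15] = 3;
    const std::uint16_t mask = nonzero_mask(input);

    const std::uint16_t merged = nonzero_mask(model_mask_compress(input, mask, input));
    const std::uint16_t zeroed = nonzero_mask(model_maskz_compress(mask, input));

    assert(mask == 0x801A);    // lanes 1, 3, 4, 15
    assert(merged == 0x801F);  // lanes 0-3 packed, plus leftovers in lanes 4 and 15
    assert(zeroed == 0x000F);  // exactly the 4 packed lanes
}
```

In this example the merge variant leaves bit 15 set in the second mask, so the masked byte store can touch str+pos+15 even though pos only advances by 4. Near the end of a 62-byte buffer that lands outside the array. With the zeroing variant the second mask is always a contiguous run of low bits matching the number of inserted elements.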