Given a __m256i register and an index i, I want to extract a single byte from each 32-bit value stored in the register and save it in another __m256i register. After performing some calculation on the second register, I also want to write that byte back into the first register without touching the other bytes.
Example:
index i = 2
__m256i a:
 3210   <- byte index within each 32-bit value
|AAAA|AAAA|AAAA|AAAA|AAAA|AAAA|AAAA|AAAA|
__m256i b:
|FAFF|FAFF|FAFF|FAFF|FAFF|FAFF|FAFF|FAFF|
// some calculation
__m256i a:
|A6AA|A6AA|A6AA|A6AA|A6AA|A6AA|A6AA|A6AA|
I am sorry if this question has been asked before; I am new to this topic, so it is quite hard to find answers. Thank you!
Here is an attempt to generalize the answers above:
const int index = 2; // byte index
__m256i mask = _mm256_set1_epi32((int)(0xFFu << (index * 8))); // byte mask |0F00|0F00|...|0F00|; unsigned shift avoids signed overflow for index 3
__m256i a; // source vector |AAAA|AAAA|...|AAAA|
__m256i b = _mm256_blendv_epi8(_mm256_set1_epi8(-1), a, mask);// extract byte |FAFF|FAFF|...|FAFF|
// ... some manipulations on b ... // b is now |BBBB|BBBB|...|BBBB|
a = _mm256_blendv_epi8(a, b, mask); // store byte |ABAA|ABAA|...|ABAA|
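For reference, the same idea can be wrapped into two small functions. This is only a sketch under some assumptions: the helper names (byte_mask, extract_byte_lanes, insert_byte_lanes) are made up for illustration, index is expected to be in 0..3, the vectors are passed as arrays of eight 32-bit values, and the target("avx2") attribute is a GCC/Clang extension used here so the snippet builds without -mavx2 (the CPU still has to support AVX2 at run time).

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: 0xFF in byte `index` (0..3) of every 32-bit lane.
   _mm256_blendv_epi8 only looks at the top bit of each mask byte. */
static inline __attribute__((target("avx2")))
__m256i byte_mask(int index)
{
    /* Shift an unsigned value so that index == 3 does not overflow a signed int. */
    return _mm256_set1_epi32((int)(0xFFu << (index * 8)));
}

/* b[k] = byte `index` of a[k] kept in place, every other byte set to 0xFF. */
__attribute__((target("avx2")))
void extract_byte_lanes(const uint32_t a[8], int index, uint32_t b[8])
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    /* Where the mask is set take the byte from va, elsewhere take 0xFF. */
    __m256i vb = _mm256_blendv_epi8(_mm256_set1_epi8(-1), va, byte_mask(index));
    _mm256_storeu_si256((__m256i *)b, vb);
}

/* a[k] keeps all of its bytes except byte `index`, which is copied from b[k]. */
__attribute__((target("avx2")))
void insert_byte_lanes(uint32_t a[8], const uint32_t b[8], int index)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    /* Where the mask is set take the byte from vb, elsewhere keep va. */
    va = _mm256_blendv_epi8(va, vb, byte_mask(index));
    _mm256_storeu_si256((__m256i *)a, va);
}
```

With a filled with 0xAAAAAAAA and index 2, extract_byte_lanes yields 0xFFAAFFFF in every lane. Writing a modified b back with insert_byte_lanes changes only byte 2 of each lane of a.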