I am needing to figure out a bit-hack to find the first location where the bit-value changes. It could be from 0
to 1
or 1
to 0
.
// This would give 0 since no changes happen
let x1 = 0b00000000
// This also gives 0 since there is no change
let x2 = 0b11111111
// This would give 1 since the value changes after the first bit
let x3 = 0b10000000
// This would also give 1 since the value change
// happens at the first bit
let x4 = 0b01111111
// This would return 7 since the change is at the 7th bit
let x5 = 0b00000001
// This would also return 7
let x6 = 0b11111110
Any recommendations on an incantation that would give these results?
The key building-block is a bit-scan like bsr
. (Or 31-lzcnt
).
That finds the position of the highest 1
bit, and we can transform your input to work for this.
If the leading bit is set, NOT it before bit-scan. Like x = ((int32_t)x<0) ? ~x : x;
with mov ecx, eax
/ xor eax, -1
(like NOT but sets FLAGS) / cmovns ecx, eax
. So like an absolute-value idiom, but with NOT instead of NEG . Another option is sar reg,31
(or cdq
) / xor
with that to flip or not.
#include <bit> // C++20
#include <stdint.h>
int transition_position(uint32_t x)
{
x = ((int32_t)x<0) ? ~x : x; // make the leading bits zero
if (x == 0) __builtin_unreachable(); // optional, x|=1 is another way to handle the case where all bits are the same, allowing BSF without branching
return std::countl_zero(x) ^ 31; // 31-lzcnt in a way that GCC can optimize back to bsf
}
This compiles nicely with GCC / clang: https://godbolt.org/z/63nEcnTso
gcc11.2 and clang13 -O3 both make this asm
transition_position(unsigned int):
mov eax, edi
sar eax, 31 # missed optimization: should shift EDI so mov is off the critical path on CPUs without mov-elimination
xor eax, edi
bsr eax, eax
ret
With -march=haswell
, they both de-optimize by using lzcnt
and then an actual xor eax,31
. (That asm is still better for AMD, where bsf
is significantly slower than tzcnt
/lzcnt
, so it's good for -mtune=generic -mbmi
, but not for -march=haswell
)