Search code examples
cbit-manipulationswar

How to write a SWAR comparison which puts 0xFF in a lane on matches?


I'm trying to write a SWAR compare-for-equality operation, working on uint64_t pretending to be 8 'lanes' of uint8_t. The closest I've managed to achieve, based on techniques in Hacker's Delight and Bit Twiddling Hacks, is the following:

uint64_t compare_eq (uint64_t x, uint64_t y) {
    uint64_t xored = x ^ y;
    uint64_t mask = 0x7F * 0x0101010101010101ULL;
    uint64_t tmp = (xored & mask) + mask;
    return ~(tmp | xored | mask);
}

However, this puts 0x80 into 'lanes' which match, and 0x00 into 'lanes' that don't, whereas I want 0xFF in 'lanes' that match, and 0x00 in 'lanes' that don't. Is it possible to write this without branching?


Solution

  • For the record, this is just a variant for calculating a high bit in nonzero bytes (one instruction less) put together with the comments from @njuffa and @Nate Eldredge (probably slightly more efficient than in 4386427's answer).

    uint64_t compare_eq (uint64_t x, uint64_t y) {
        uint64_t xored = x ^ y;
        uint64_t mask = ((((xored >> 1) | 0x8080808080808080) - xored) & 0x8080808080808080);
        return (mask << 1) - (mask >> 7);
    }