How "bitwise AND mask equals mask" can be optimized?
Example:
bool foo(uint64_t x)
{
    return (x & 0x7ff0000000000000ULL) == 0x7ff0000000000000ULL;
}
leads to (ARM 32-bit):
gcc 12.1 (linux) -O3:
f:
        movs    r3, #0
        movt    r3, 32752
        bics    r3, r3, r1
        ite     eq
        moveq   r0, #1
        movne   r0, #0
        bx      lr
armv7-a clang 11.0.1 -O3:
f:
        mov     r0, #267386880
        orr     r0, r0, #1879048192
        bic     r0, r0, r1
        rsbs    r1, r0, #0
        adc     r0, r0, r1
        bx      lr
Can the C code above be rewritten so that the compiler produces faster assembly?
Are there relevant bit-twiddling hacks, or combinations of them, that would help here?
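One well-known identity (an aside, not from the original post) is that (x & m) == m holds exactly when (m & ~x) == 0; the bic/bics instructions in both outputs above already compute this form on the high word, since the low 32 bits of the mask are zero. A minimal sketch, using the hypothetical name foo2:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical variant (the name foo2 is not from the post): rewrites
   (x & m) == m as (m & ~x) == 0, which is what the bic/bics in the
   compiler output above already computes on the high 32 bits of x. */
bool foo2(uint64_t x)
{
    return (0x7ff0000000000000ULL & ~x) == 0;
}

On its own this identity does not buy much here: the comparison result still has to be materialized as 0/1, which is where the remaining instructions go.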
One option is
bool foo4(uint64_t x)
{
    return (((x << 1) >> 53) + 1) >> 11;
}
which compiles with gcc to
foo:
        ubfx    r0, r1, #20, #11
        adds    r0, r0, #1
        ubfx    r0, r0, #11, #1
        bx      lr
The saving here comes mostly from not having to convert a flags result into 0/1: (x << 1) >> 53 isolates the eleven mask bits, adding 1 carries into bit 11 exactly when all of them are set, and the final shift extracts that carry bit directly. If this function is inlined and the result is only used for a branch, this is not helpful and might actually result in slower code.
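As a quick sanity check (not part of the original post), here is a minimal harness, assuming foo and foo4 are defined exactly as above, that confirms both forms agree on a few boundary bit patterns:

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

bool foo(uint64_t x)
{
    return (x & 0x7ff0000000000000ULL) == 0x7ff0000000000000ULL;
}

bool foo4(uint64_t x)
{
    return (((x << 1) >> 53) + 1) >> 11;
}

int main(void)
{
    const uint64_t tests[] = {
        0,                          /* all mask bits clear             */
        0x7ff0000000000000ULL,      /* exactly the mask                */
        0x7fefffffffffffffULL,      /* one mask bit clear              */
        0xfff8000000000000ULL,      /* mask plus sign and low bits set */
        UINT64_MAX,                 /* all bits set                    */
        1,                          /* no mask bits set                */
        0x0010000000000000ULL,      /* only the lowest mask bit set    */
    };
    for (size_t i = 0; i < sizeof tests / sizeof tests[0]; i++)
        assert(foo(tests[i]) == foo4(tests[i]));
    puts("foo and foo4 agree on all test patterns");
    return 0;
}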