Checking the SHL x86 assembly instruction for overflow

After executing SHL EAX, CL, is there an easy way to check whether the mathematical result (EAX × 2^CL) still fits into EAX? The question is about unsigned integers.

I hoped for something like checking the carry flag in the case of ADD instructions or checking EDX in the case of MUL instructions.

Solution

You can't do this by checking after the instruction, because SHL does not record whether unsigned overflow occurred, i.e. whether any 1 bits were shifted out. The carry flag holds only the last bit that was shifted out.

For instance, if EAX = 0xF0000000 and CL = 7, then SHL EAX, CL will leave EAX = 0 and the carry flag clear, because the last bit shifted out (bit 25 of the input) is 0. You would get exactly the same architectural result, including all the other flag values, if the input had been EAX = 0x00000000. So even though one case overflows and the other does not, there is no way to distinguish them after the fact.
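
To make that concrete, here is a small C model of what SHL leaves behind: the 32-bit result and the carry flag. It is only a sketch for experimenting, not a full emulation. The names `shl_outcome` and `model_shl` are made up for this example, and counts of 0 (which leave the flags untouched) are deliberately not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: the visible outcome of SHL r32, CL. */
struct shl_outcome {
    uint32_t result;  /* value left in the register */
    bool carry;       /* CF after the shift */
};

struct shl_outcome model_shl(uint32_t value, unsigned count)
{
    /* Assumption: 1 <= count <= 31; a count of 0 is not modeled here. */
    assert(count >= 1 && count < 32);

    struct shl_outcome out;
    out.result = value << count;
    /* CF receives the last bit shifted out, i.e. original bit (32 - count). */
    out.carry = ((value >> (32 - count)) & 1u) != 0;
    return out;
}
```

Calling `model_shl(0xF0000000u, 7)` and `model_shl(0u, 7)` gives the same result and the same carry, which is exactly why the two cases cannot be told apart afterwards.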

You can test before the instruction whether it will overflow, by checking if the top CL bits of EAX are zero. For instance (untested):

    MOV    EBX, 0x80000000
    SAR    EBX, CL    ; arithmetic shift: now top CL+1 bits of EBX are 1
    SHL    EBX, 1     ; now top CL bits of EBX are 1 (EBX = 0 if CL = 0)
    TEST   EBX, EAX   ; nonzero iff any of the top CL bits of EAX is set
    JNZ    will_overflow
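
For experimenting outside of assembly, the same pre-check can be sketched in C. This is my own illustration of the mask idea, with a made-up function name, and it assumes the count has already been reduced to 0..31 the way the hardware masks CL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if value * 2^count does not fit in 32 unsigned bits.
   Hypothetical helper; assumes count is already in 0..31. */
bool shl_would_overflow_u32(uint32_t value, unsigned count)
{
    assert(count < 32);

    /* Top `count` bits set; for count == 0 the mask is empty. */
    uint32_t top_mask = ~(UINT32_MAX >> count);

    /* Overflow happens exactly when one of those bits is set in value. */
    return (value & top_mask) != 0;
}
```

With the earlier example, `shl_would_overflow_u32(0xF0000000u, 7)` reports an overflow while `shl_would_overflow_u32(0u, 7)` does not.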

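Some other possible algorithms (maybe better) were suggested in the comments, such as using SHLD, which can shift the top CL bits of EAX into a zeroed register so that they can be tested for zero.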
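
I have not reproduced the comments here, so the following is only my guess at how the SHLD idea would work: collect the bits that fall off the top and test them. In C the closest equivalent is to do the shift in 64 bits and look at the upper half. The function name is hypothetical, and the count is assumed to be in 0..31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shifts *value left by count and returns true if any 1 bits were lost.
   Hypothetical helper; assumes count is already in 0..31. */
bool shl_u32_checked(uint32_t *value, unsigned count)
{
    assert(count < 32);

    /* A 32-bit value shifted by at most 31 always fits in 64 bits. */
    uint64_t wide = (uint64_t)*value << count;

    /* The upper half holds the bits that SHLD would collect
       in a register that started out as zero. */
    uint32_t spilled = (uint32_t)(wide >> 32);

    *value = (uint32_t)wide;  /* same low 32 bits as SHL would leave */
    return spilled != 0;
}
```

Unlike the mask version, this performs the shift and the check in one step, at the cost of needing a second register (or a 64-bit one).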

Just because I happened to be thinking about it, here's a check for signed overflow, which occurs if and only if the top CL+1 bits are not all equal.

    MOV    EBX, 0x80000000
    SAR    EBX, CL      ; now top CL+1 bits of EBX are 1
    MOV    EDX, EAX     ; copy the input to be shifted
    AND    EDX, EBX     ; mask off all but top CL+1 bits
    JZ     no_overflow  ; top bits all 0
    CMP    EDX, EBX     ; are the top CL+1 bits all 1?
    JE     no_overflow  ; top bits all 1
    ; else handle overflow
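
The signed rule (top CL+1 bits all equal) can also be written as a C sketch for testing against known values. Again the function name is made up, and the count is assumed to be in 0..31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if value * 2^count does not fit in a signed 32-bit integer.
   Hypothetical helper; assumes count is already in 0..31. */
bool shl_would_overflow_i32(int32_t value, unsigned count)
{
    assert(count < 32);

    uint32_t bits = (uint32_t)value;  /* two's complement bit pattern */

    /* Top count+1 bits set; shifting 0x7FFFFFFF avoids a shift by 32. */
    uint32_t top_mask = ~(UINT32_C(0x7FFFFFFF) >> count);
    uint32_t top = bits & top_mask;

    /* No overflow only if those bits are all 0 or all 1. */
    return top != 0 && top != top_mask;
}
```

For example, 0x40000000 shifted by 1 overflows (top two bits are 01), while -0x40000000 shifted by 1 does not (top two bits are 11, and the result is exactly -2^31).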

For the one-bit shift instruction SHL EAX, 1, the carry flag does indicate exactly whether unsigned overflow occurs. Likewise, the overflow flag indicates whether signed overflow occurs. In fact, all of CF, OF, ZF, SF, PF are set exactly the same as they would be for the mathematically equivalent ADD EAX, EAX. So if you are optimizing for space over speed, then instead of SHL EAX, CL, you could consider a loop that iterates CL times (skipping the loop entirely when CL is 0), executing SHL EAX, 1 followed by JC overflow_occurred on each iteration.
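
The loop idea looks roughly like this when modeled in C. It is a sketch only: the function name is hypothetical, the count is assumed to be in 0..31, and unlike the assembly loop (which would leave EAX partially shifted) this version leaves the value untouched when it reports an overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shifts *value left one bit at a time, count times.
   Returns true as soon as a 1 bit would be shifted out; in that case
   *value is left unchanged. Hypothetical helper; count in 0..31. */
bool shl_u32_by_steps(uint32_t *value, unsigned count)
{
    assert(count < 32);

    uint32_t v = *value;
    while (count-- > 0) {
        /* Bit 31 is what SHL EAX, 1 moves into the carry flag. */
        bool carry = (v >> 31) != 0;
        v <<= 1;
        if (carry)
            return true;  /* corresponds to JC overflow_occurred */
    }
    *value = v;
    return false;
}
```

A count of 0 never enters the loop, so the value comes back unchanged and no overflow is reported.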