What is an easy way to check, after the SHL EAX, CL instruction, whether the theoretical result (EAX times 2 to the power CL) fits into EAX? The question is about unsigned integers.
I hoped for something like checking the carry flag in the case of ADD instructions or checking EDX in the case of MUL instructions.
You can't do this by checking after the instruction. It does not record information about unsigned overflow, i.e. whether any 1 bits were shifted out. The carry flag only contains the last bit that was shifted out.
For instance, if EAX = 0xF0000000 and CL = 7, then SHL EAX, CL will leave EAX = 0 and the carry flag clear. You would get exactly the same architectural result, including all the other flag values, if the input was EAX = 0x00000000. So even though one has an overflow and the other does not, there is no way to distinguish them after the fact.
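To make that concrete, here is a small C model of what the shift leaves behind. It is only a sketch: the struct and function names are made up for illustration, and it tracks just the result and the carry flag, not the other flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state left by SHL r32, CL:
   the 32-bit result and the carry flag only. */
struct shl_state {
    uint32_t result;
    bool carry;
};

static struct shl_state model_shl32(uint32_t eax, unsigned cl)
{
    assert(cl < 32);  /* assumption: count already in 0..31 (the CPU masks it to 5 bits) */

    /* For a count of 0 the real instruction leaves the flags untouched;
       this model simply reports carry = false in that case. */
    struct shl_state s = { eax, false };
    if (cl != 0) {
        s.carry = (eax >> (32 - cl)) & 1u;  /* last bit shifted out */
        s.result = eax << cl;               /* everything above bit 31 is discarded */
    }
    return s;
}
```

Calling model_shl32(0xF0000000u, 7) and model_shl32(0x00000000u, 7) gives the same result (0) and the same carry (clear), even though only the first input lost any 1 bits.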
You can test before the instruction whether it will overflow, by checking if the top CL bits of EAX are zero. For instance (untested):
MOV EBX, 0x80000000
SAR EBX, CL ; now top CL+1 bits of EBX are 1
SHL EBX, 1 ; now top CL bits of EBX are 1
TEST EBX, EAX ; mask off all lower bits
JNZ will_overflow
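If it helps to see the same test outside of assembly, here is a C sketch of the mask check. The function name is hypothetical, and the mask is built with an unsigned shift and a complement instead of SAR, which should produce the same "top CL bits set" pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if eax << cl would lose any 1 bits, i.e. if any of the
   top cl bits of eax are set. Hypothetical helper, for illustration. */
static bool shl32_would_overflow_unsigned(uint32_t eax, unsigned cl)
{
    assert(cl < 32);  /* assumption: count already in 0..31 */

    /* UINT32_MAX >> cl has its top cl bits clear, so the complement has
       exactly the top cl bits set (and is 0 when cl == 0). */
    uint32_t mask = (uint32_t)~(UINT32_MAX >> cl);
    return (eax & mask) != 0;
}
```

With CL = 7, for example, 0x01FFFFFF passes the check and 0x02000000 does not, since bit 25 is the lowest bit that gets shifted out.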
Some other possible algorithms (maybe better) were suggested in the comments, such as SHLD.
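I'm only guessing at how SHLD was meant to be used, but one plausible reading is: zero a second register, then SHLD it with EAX as the source, so that it collects the bits that SHL would throw away. A C sketch of that idea, using a 64-bit shift to stand in for the double-register shift (the function name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Returns the bits that eax << cl would push past bit 31. This is what
   a zeroed register would hold after SHLD reg, EAX, CL. A nonzero
   return value means unsigned overflow. */
static uint32_t shl32_lost_bits(uint32_t eax, unsigned cl)
{
    assert(cl < 32);  /* assumption: count already in 0..31 */

    uint64_t wide = (uint64_t)eax << cl;  /* full 64-bit product eax * 2^cl */
    return (uint32_t)(wide >> 32);        /* upper half = shifted-out bits */
}
```

Unlike the flags left by SHL itself, this does distinguish the two inputs from the earlier example: 0xF0000000 with a count of 7 yields 0x78, while 0 yields 0.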
Just because I happened to be thinking about it, here's a check for signed overflow, which occurs if and only if the top CL+1 bits are not all equal.
MOV EBX, 0x80000000
SAR EBX, CL ; now top CL+1 bits of EBX are 1
MOV EDX, EAX ; copy the input to be shifted
AND EDX, EBX ; mask off all but top CL+1 bits
JZ no_overflow ; top bits all 0
CMP EDX, EBX
JE no_overflow ; top bits all 1
; else handle overflow
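The same signed check, written as a C sketch so the logic is easier to test. The function name is hypothetical, and the input is taken as a raw 32-bit pattern so that no signed shifts are needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if shifting the signed value with bit pattern eax left
   by cl would overflow, i.e. if the top cl+1 bits are not all equal. */
static bool shl32_would_overflow_signed(uint32_t eax, unsigned cl)
{
    assert(cl < 32);  /* assumption: count already in 0..31 */

    /* Two separate shifts, so that cl == 31 never becomes a shift by 32. */
    uint32_t mask = (uint32_t)~((UINT32_MAX >> cl) >> 1);  /* top cl+1 bits set */
    uint32_t top = eax & mask;
    return top != 0 && top != mask;  /* neither all 0 nor all 1 */
}
```

For CL = 7 this accepts 0xFF000000 (the result is the most negative value, which still fits) and rejects 0x01000000 (the result would land on the sign bit).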
For the one-bit shift instruction SHL EAX, 1, the carry flag does indicate exactly whether unsigned overflow occurs, and the overflow flag indicates whether signed overflow occurs. In fact, all of CF, OF, ZF, SF, PF are set exactly the same as they would be for the mathematically equivalent ADD EAX, EAX. So if you are optimizing for space over speed, then instead of SHL EAX, CL, you could consider a loop, iterating CL times and executing SHL EAX, 1 followed by JC overflow_occurred on each iteration.
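A rough C model of that loop, in case the control flow is clearer this way. The function name is hypothetical; the carry out of each one-bit shift is just the old top bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shifts *eax left one bit at a time, cl times. Returns false as soon as
   a 1 bit is carried out (the JC overflow_occurred case), true otherwise.
   On failure *eax holds the partially shifted value. */
static bool shl32_by_steps(uint32_t *eax, unsigned cl)
{
    assert(cl < 32);  /* assumption: count already in 0..31 */

    while (cl-- > 0) {
        bool carry = (*eax >> 31) & 1u;  /* CF after SHL EAX, 1 */
        *eax <<= 1;
        if (carry) {
            return false;                /* unsigned overflow detected */
        }
    }
    return true;
}
```

A count of 0 runs no iterations and reports success, so an assembly version of the loop would need to skip the body in that case too.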