I'm quite new to assembly
Consider the following function:
where the '+' stands for 'Or' logic gate and the concatenation of variables, stands for 'And' logic gate.
How can I implement such a function in emu8086? given that the input parameters may stand for the bits of a register
AL, for example, the output would be changing its value into 0 or 1.
UPDATE:
here's what I did, I know it's bad written and it does seem work if there is any suggestion or an easier way let me know
thanks all for your help especially peter.
org 100h
mov al, 0A3h pour le test
mov ah, al
and ah, 01h ;ah = x0
shr al, 1
mov bl, al
and bl, 01h ;bl = x1
shr al, 1
mov bh, al
and bh, 01h
not bh
and bh, 01h ;bh = !x2
shr al, 1
mov cl, al
and cl, 01h
not cl
and cl, 01h ;cl = !x3
shr al, 1
mov ch, al
and ch, 01h
not ch
and ch, 01h ;ch = !x4
shr al, 1
mov dl, al
and dl, 01h ;x5 = dl
shr al, 1
mov dh, al
and dh, 01h
not dh
and dh, 01h ;dh = !x6
shr al, 1 ;al = x7
and bh, dl
and bh, cl
and bh, ah ;!x2 and x5 and !x3 and x0
and dh, bl
and dh, ch
and dh, al ;!x6 and x1 and !x4 and x7
or dh, bh
mov ah, dh ;resultat dans ah
ret
Are you sure that expression with four x_n
values next to each other is supposed to be bitwise AND, not concatenating them into 4-bit values? And then a binary add? Because I might have guessed that instead. If so, see https://codegolf.stackexchange.com/a/203610 for a shift and rcl reg, 1
way to split bits up between a pair of registers. Or on a modern x86 with BMI2, you could use 2x pext
and add
to do that.
The fact that the expressions have the bits in each group in a specific order that isn't just ascending or descending is probably a clue that they want you to unpack the byte into two 4-bit integers and do a normal +
with it.
The rest of this answer is about optimizing the operations in your asm which does two groups of ANDs, and ORs that result together into a single boolean value, producing a 0
or 1
in AL.
There are some improvements you can make to your simplistic / straightforward implementation that just extracts each bit separately. e.g. you don't need to AND before and after you NOT. The first AND is going to leave the high bits all 0, then NOT makes them 1, then the second AND makes them zero again.
mov bh, al
; and bh, 01h ; This is pointless
not bh
and bh, 01h ;bh = !x2
You can take this farther: You're purely using bitwise operations and only care about the low bit in each register. You can and al, 1
once at the end to isolate the bit you want, with all the temporaries carrying around garbage in their high bits.
To flip some bits but not all, use XOR with a constant mask. e.g. to flip bits 6,4,3,2 in AL and leave the others unmodified, use xor al, 01011100b
1. Then you can shift and mov to separate registers without needing any NOT instructions.
Footnote 1: The trailing b
indicates base 2 / binary. That works in MASM syntax, IDK if emu8086 supports it or if you'd have to write the equivalent hex.
And you can AND right into those registers instead of extracting first, so you only need two scratch regs.
xor al, 01011100b ; complement bits 6,4,3,2
mov cl, al ; x0, first bit of the 2&5&3&0 group
shr al, 1
mov dl, al ; x1, first bit of the 6&1&4&7 group
shr al, 1
and cl, al ; AND X2 into the first group, X2 & x0
shr al, 1
and cl, al ; cl = X2 & X3 & x0
... ; cl = 2&5&3&0, dl = 6&1&4 with a few more steps
shr al, 1 ; AL = x7
and al, dl ; AL = x6 & x1 & x4 & x7 (reading 6,1,4 from dl)
or al, cl ; logical + apparently is regular (not exclusive) OR
and al, 1 ; clear high garbage
ret
(With plain ASCII comments, I've ignored the "complement" part because we handle it all with one instruction at the start.)
That's as far as I can see us going with a straightforward implementation that just gets the bits to the bottom of a register and does each boolean operation (other than complement) with a separate asm instruction.
To do better, we need to take advantage of the 8 (or 16) bits in a register we can do in parallel with one instruction. We can't easily shuffle bits to make them line up with each other because the pattern is irregular.
IDK if there's anything clever we can do left-shifting AX to get bits from AL into the bottom of AH, as well as grouping some at the top of AL. Hmm, maybe alternate shl ax
with rol al
to send bits back to the bottom of AL. But that still needs 7 shifts to separate the bits. (shl ax,2
and rol al,2
for the contiguous bits that go together (7,6 and 3,2) are only available on 186, and putting a count in CL is barely worth it).
The more likely angle of attack is FLAGS: most ALU operations update FLAGS according to the result, with ZF being set to 1 if all bits in a result are 0, otherwise to 1. This gives us a horizontal OR operation across bits in one register. Since !(a | b)
= !a & !b
, we can invert the non-complemented bits in the input to use that as a horizontal AND instead of OR. (I'm using !
for a single bit invert. In C, !
is a logical not that turns any non-zero number into 0, unlike ~
bitwise NOT.)
But unfortunately 8086 doesn't have an easy way to turn ZF into a 0/1 in a register directly. (386 adds setcc r/m8
, e.g. setz dl
sets DL = 0 or 1 according to ZF.) That is possible for CF. We can get CF set according to a register being non-zero by using sub reg, 1
, which sets CF if the reg was 0 (because a borrow comes out the top). Otherwise it clears CF. We can get a 0 / -1 in a reg according to CF with sbb al, al
(subtract with borrow). The al-al part cancels, leaving 0 - CF
.
To set up for using FLAGS, we can use AND masks to separate the bits into their two groups.
;; UNTESTED, I might have some logic inverted.
xor al, 10100011b ; all bits are the inverse of their state in the original expression.
mov dl, al
and dl, 11010010b ; ~x[7,6,4,1]
and al, 00101101b ; ~x[5,3,2,0]
cmp dl, 1 ; set CF if that group was all zero (i.e. if the original AND was 1), else clear
sbb dl, dl ; dl = -1 or 0 for the first group
cmp al, 1
sbb al, al ; al = -1 or 0 for the second group. Fun fact: undocumented SALC does this
or al, dl ; The + in the original expression
and al, 1 ; keep only the low bit
ret
There's probably even more we could do, like maybe and al, dl
to clear the bits in AL or not, according to the SBB result in DL. Or maybe adc al, -1
instead of cmp al, 1
to use the CF result from DL to affect how CF gets set from AL.
Instead of subtracting 1
, you could sub dl, 11010010b
with the AND mask you used, so you get 0
if they were all set, otherwise it wraps and you set CF. Not sure if that's useful.
The amount of negations / inversions quickly gets tricky to do in your head, but if every byte of code-size or every cycle of performance matters then it's something you should look into. (These days that's rarely the case, and when it is you're often vectorizing with SSE2 or AVX so you wouldn't have flags, just bitwise within vector elements and packed compare that turns a match into all-ones and a non-match into 0.)
Note that after splitting up with mov/AND, neither AL nor DL can be all-ones, so adding 1
can never wrap to zero. So maybe sbb al, -1
could add 0 or 1 and could set ZF?
If you want to branch, branching on ZF is fine with jz
or jnz
. That might even be best on 8086, e.g. if the first AND group gives a 1
, we don't need to isolate the other group. So xor al, ...
to complement bits accordingly, then test al, mask1
/ jnz check_other_group
/ mov al,1
would be a good fall through fast path.