I'm studying assembly language (for the 68000 microprocessor), and I came across the following problem:
Write a 68000 assembly language program that will perform 5*X + 6*Y + [Y/8] -> [D1.L], where X is an unsigned 8-bit number stored in the lowest byte of D0 and Y is a 16-bit signed number stored in the upper 16 bits of D1. Neglect the remainder of Y/8.
And this is the solution:
ANDI.W #$00FF,D0 ;CONVERT X TO UNSIGNED 16-BIT
MULU.W #5,D0 ;COMPUTE UNSIGNED 5*X IN D0.L
SWAP.W D1 ;MOVE Y TO LOW 16 BITS IN D1
MOVE.W D1,D2 ;SAVE Y TO LOW 16 BITS OF D2
MULS.W #6,D1 ;COMPUTE SIGNED 6*Y IN D1.L
ADD.L D0,D1 ;ADD 5*X WITH 6*Y
EXT.L D2 ;SIGN EXTEND Y TO 32 BITS
ASR.L #3,D2 ;PERFORM Y/8;DISCARD REMAINDER
ADD.L D2,D1 ;PERFORM 5*X+6*Y +Y/8
FINISH JMP FINISH
However, I don't understand the first line. I guess that we have to convert it to 16 bits because Y is a 16-bit signed number, but I don't get how the ANDI instruction works in this case.
Also, in the example they use ASR.L #3,D2 to divide D2 by 8; so if I use ASR.L #2,D2 then I'll divide D2 by 4? Or how does it work?
And if I use LSL.L #3,D2 then I would be multiplying D2 by 8?
And finally, why did they sign extend Y to 32 bits?
Thank you for your help!
I don't know 68000 asm, but ANDI will be AND with immediate I suppose, so result_16b = 0x00FF & X_8b;
This will make sure that upper 8bits of result are zero, and lower 8bits contains the X value.
Shifting left/right is actually multiplying/dividing by powers of 2.
Consider the value 10 in 8-bit binary:
0000 1010 = 10 (8+2)
0001 0100 = 10 after shift to left by 1 position = 16+4 = 20 (10*2)
0000 0101 = 10 after shift to right by 1 position = 4+1 = 5 (10/2)
0000 0010 = 10 after shift to right by 2 positions = 2 (10/4 truncated)
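To make the shift examples above checkable, here is a small Python sketch of the three shift flavours on a fixed-width value. The function names `lsl`, `lsr` and `asr` and the `bits` parameter are my own choices for illustration, not 68000 syntax. One detail worth knowing: an arithmetic shift right of a negative value rounds toward minus infinity, so -9 shifted right by 1 gives -5, not -4.

```python
def lsl(value, count, bits=8):
    """Logical shift left: zeros enter at the bottom, bits leaving the top are lost."""
    mask = (1 << bits) - 1
    return (value << count) & mask


def lsr(value, count, bits=8):
    """Logical shift right: zeros enter at the top (unsigned divide by 2**count)."""
    mask = (1 << bits) - 1
    return (value & mask) >> count


def asr(value, count, bits=8):
    """Arithmetic shift right: the top (sign) bit is copied into the vacated bits."""
    mask = (1 << bits) - 1
    value &= mask
    if value & (1 << (bits - 1)):
        value -= 1 << bits          # reinterpret the pattern as a negative number
    return (value >> count) & mask  # Python's >> on ints keeps the sign
```

So shifting by 2 divides by 4 and shifting left by 3 multiplies by 8, as long as the result still fits in the register width.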
So on some CPUs, to do X*5 it may be faster (than using MUL) to do:
x4 = (x<<2); // copy of x shifted left twice (*4)
x = x + x4; // sum of original plus copy
It's the same principle, as we do *10 and /10 in decimals just by moving digits left and right. Like 45 moved left by 2 positions is 45*100 = 4500. And 45 moved right by 1 position is 4 (45/10 truncated).
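The shift-and-add idea for X*5 can be written as a tiny Python sketch. The name `times5` and the 16-bit default width are assumptions of mine, picked only so the result wraps the way a 16-bit register would.

```python
def times5(x, bits=16):
    """Compute 5*x as (x << 2) + x, wrapped to the given register width."""
    mask = (1 << bits) - 1
    x4 = (x << 2) & mask     # copy of x shifted left twice (*4)
    return (x + x4) & mask   # original plus the *4 copy gives *5
```

For an 8-bit X the largest product is 5*255 = 1275, which fits in 16 bits without wrapping.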
They probably extend the last instructions to 32b because the result should be in D1.L. I think the previous MUL may produce 32b result values; check the instruction reference for details.
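As a rough cross-check of the whole listing, this is a Python sketch that mimics each instruction on 32-bit register values. It is my own model, not an emulator: condition codes are ignored and `run_solution` and `to_signed` are made-up helper names. It relies on MULU.W and MULS.W taking 16-bit operands and producing a 32-bit product, which is why the final additions are done as .L and why Y has to be sign extended to 32 bits before it is added.

```python
M32 = 0xFFFFFFFF


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def run_solution(d0, d1):
    """Return the final 32-bit content of D1 for initial D0 and D1."""
    d0 = (d0 & 0xFFFF0000) | (d0 & 0x00FF)  # ANDI.W #$00FF,D0 (low word only)
    d0 = (d0 & 0xFFFF) * 5                  # MULU.W #5,D0: unsigned 16x16 -> 32
    d1 = ((d1 << 16) | (d1 >> 16)) & M32    # SWAP D1: Y moves to the low word
    d2 = d1 & 0xFFFF                        # MOVE.W D1,D2 (upper half modeled as 0)
    d1 = (to_signed(d1, 16) * 6) & M32      # MULS.W #6,D1: signed 16x16 -> 32
    d1 = (d0 + d1) & M32                    # ADD.L D0,D1
    d2 = to_signed(d2, 16) & M32            # EXT.L D2: sign extend word to long
    d2 = (to_signed(d2, 32) >> 3) & M32     # ASR.L #3,D2: Y/8, remainder dropped
    return (d1 + d2) & M32                  # ADD.L D2,D1
```

With X = 255 and Y = 32767 the result is 201972, which no longer fits in 16 bits, so the 32-bit result is really needed.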
But I'm confused about those [] in the problem description; in other assemblers, [x] usually denotes using the value in "x" as an address to fetch the value from memory, not using the value x directly. But it doesn't make much sense in your problem description, so it's probably just about values...
About AND and EXT. Yes, that's one way of looking at it, remembering that you extend unsigned numbers by AND, and signed by EXT, but when learning Assembly language, you should always strive also for the second perspective, which is "what it actually does", and learn both.
What it does is most of the time important on the bit level.
So you have 8bits of unsigned number X, and you want it to add to 16bit number tempY, what's the catch?
If you would just add tempY_low8b + X, you would get the correct tempY_low8b, but tempY_high8b may be missing a +1 if the low8b addition overflows.
How to fix that? You can either extend the X value to 16b, or detect overflow and fix tempY_high8b. On the 68k I assume it would be very impractical to manipulate high 8b of tempY (on x86 it may boil down to simple ADC ah,0, if ax = ah:al register is used as tempY), so it's easier to extend the X.
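The missing +1 is easy to see with concrete numbers. This Python sketch compares adding X into the low byte only against adding a zero-extended X to the whole 16-bit value; both function names are hypothetical and exist only for this illustration.

```python
def add_low_byte_only(temp_y, x):
    """Add x to the low 8 bits of temp_y; the carry out of bit 7 is lost."""
    low = ((temp_y & 0xFF) + (x & 0xFF)) & 0xFF
    return (temp_y & 0xFF00) | low


def add_zero_extended(temp_y, x):
    """Zero-extend x to 16 bits first, then do a full 16-bit add."""
    return (temp_y + (x & 0x00FF)) & 0xFFFF
```

With tempY = 0x01F0 and X = 0x20 the byte-only add gives 0x0110, while the correct 16-bit sum is 0x0210.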
To extend X in terms of bits means to copy X to something capable of holding 16+ bits, and reset the upper 8 bits to zero (an unsigned 8b value will have all the extended bits zero when extended, "obvious" from how binary numbers work). As you have X in D0, which has 16+ bits, you don't need to copy, just clear the upper bits. That's what AND is good for: to clear a particular bit by not providing it in a "mask" value, so when you do AND X,mask, every bit in the result where the mask has a 0 bit will become zero too. Other bits of X (where the mask has 1) will keep their value. So ANDI.W #$00FF,D0 is doing that, masking 16b of D0 with mask 0000 0000 1111 1111, clearing upper 8b, keeping lower 8b.
If you think about some problem in assembly, and you get to the point where you realize you need to reset bits i, j and k to zero, you can now recall there's a single AND capable of doing that (with a proper mask with i, j and k zeroed).
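Building such a mask can be sketched in Python like this. `clear_bits` is a name I made up for the example; the point is only that the mask starts as all ones, gets a 0 at every position to be cleared, and is then applied with one AND.

```python
def clear_bits(value, positions, bits=16):
    """Return value with the listed bit positions forced to 0 by a single AND."""
    mask = (1 << bits) - 1
    for p in positions:
        mask &= ~(1 << p)   # put a 0 in the mask at each position to clear
    return value & mask
```

Clearing positions 8 to 15 of a 16-bit value builds exactly the mask $00FF used in the ANDI.W line, for example `clear_bits(0xABCD, range(8, 16))` gives 0x00CD.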
Now Y is signed number, so extending this one is more tricky. If the highest bit of Y is set to 0, it means the number is positive, and you can extend it by adding zeroes ahead of it. But if the highest bit is 1, it is negative, and to get the same negative value extended to more bits is to add ones ahead of it.
0010 = +2 in signed 4b -> extended to 8b -> 0000 0010 = +2 in signed 8b
1110 = -2 in signed 4b -> extended to 8b -> 1111 1110 = -2 in signed 8b
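The same two examples in Python, as a sketch of what a sign extension has to do on the bit level. `sign_extend` and its parameters are assumptions for illustration; the widths are passed in so the 4b to 8b examples and the 16b to 32b case of EXT.L can use the same helper.

```python
def sign_extend(value, from_bits, to_bits):
    """Widen a two's complement value by copying its top bit into the new bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        # negative: set every bit between from_bits and to_bits to 1
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value
```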
There's no basic bit operator with such a function, as in "copy some bit up to all the other upper bits", so that's the reason why this has the specialized instruction EXT.
If you would have to stick to basic bit operations, you can do the same by shifting the narrow bits firstly up, to have the highest bit at highest bit position in wider version too, then use "arithmetic" shift right back to the original position. Arithmetic shift right will fill up the new bits with copy of the highest bit.
0010 to 8b: shift left 4 = 0010 0000 -> a.shift right 4 -> 0000 0010
1110 to 8b: shift left 4 = 1110 0000 -> a.shift right 4 -> 1111 1110
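The shift-up, shift-back trick from the two lines above can be sketched in Python as follows. The function name is hypothetical, and because Python integers have no fixed width, the wide value is reinterpreted as negative by hand before the arithmetic shift right.

```python
def sign_extend_by_shifts(value, from_bits, to_bits):
    """Sign extend by shifting left to the top, then arithmetic shifting back."""
    shift = to_bits - from_bits
    wide_mask = (1 << to_bits) - 1
    shifted = (value << shift) & wide_mask   # narrow top bit is now the wide top bit
    if shifted & (1 << (to_bits - 1)):
        shifted -= 1 << to_bits              # treat the wide pattern as negative
    return (shifted >> shift) & wide_mask    # arithmetic shift right copies the top bit
```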
That's why most of the processors have two variants of shift bits right instruction, one is putting 0 into new bits, other is keeping the top bit. The left shift, while some CPUs do have two variants, is same in both, setting up the lower bit to 0.
So if you did learn that you extend unsigned numbers by AND and signed by EXT, it will do, but if you also keep the knowledge about what's going on inside, you can sometimes figure out different possibilities.
For example:
EOR.W D1,D1
MOVE.B D0,D1
Will put into D1 the 16b unsigned number extended from the 8b unsigned in D0. I will let you figure this one out on your own, as an exercise. :)
So don't learn just blindly "how is done that and that", rather always check you fully understand the operations and their purpose. That will allow you more quickly to build your own solutions. After all, there's actually very few basic operations possible with CPU.
Generally all CPUs have (not always all of these, but from the basic ones enough of them to emulate the rest of it):
So if you learn the first three groups, and get used to thinking about them on the bit level, you can get quite far on any other CPU, after a short time learning the syntax of its "different" assembler. These are quite universal. This also helps while programming in higher-level languages, so you have a better idea of what is a cheap native operation for the CPU and what is a more complex calculation, and you can pick the simpler solution to your problem.