
Replacing two instructions with one instruction in assembly language


[Image: table listing each instruction, its weight in the program (%), and its CPI]

Here is a table of the instructions in a program. The first column is the instruction itself, the second column is the instruction's weight (its percentage share of the program), and the third column is the instruction's CPI (cycles per instruction).

Now, I want to replace add and mult with a single instruction, madd.

madd takes three inputs, multiplies the first two, and adds the product to the third. This way a single madd instruction does both the multiplication and the addition.

The combined weight of add and mult is currently 40%. So the question is: when I replace mult and add with madd, what will the weight of the new madd instruction be?


Solution

  • The combined weight of add and mult is currently 40%. So the question is: when I replace mult and add with madd, what will the weight of the new madd instruction be?

    If the program consists of additions (that consume 30% of CPU time) in one place and multiplications (that consume 10% of CPU time) in a completely different place that has nothing to do with the additions, then it may be impossible to replace anything with a "multiply and add" instruction.

    If the program consists of one addition (that consumes 30% of CPU time by itself) and 1000 multiplications (that consume 0.01% of CPU time each), then in the best case you can only replace one addition and one multiplication, and you'd still have 999 "unpaired" multiplications.

    If the program consists of 1000 additions (that consume 0.03% of CPU time each) and one multiplication (that consumes 10% of CPU time by itself), then in the best case you can only replace one addition and one multiplication, and you'd still have 999 "unpaired" additions.

    If you know that there are an equal number of additions and multiplications, and that all of them can be paired and replaced by "multiply and add" instructions without changing the number of loads, stores or branches, then you still don't know how fast or slow a "multiply and add" instruction is (or at least you haven't provided that information), and therefore it would still be impossible to figure out how it would affect performance.

    In other words, you've got almost none of the information that would be needed to find an answer.
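
To make the dependence on the missing information concrete, here is a minimal sketch in Python. It assumes the weights are fractions of the dynamic instruction count (not CPU time), and it takes the number of fusable add/mult pairs as an input, because that number cannot be derived from the table. The function name `replace_with_madd` and the parameter `paired_frac` are hypothetical, introduced only for this illustration; the 30%/10% split is the one used in the cases above.

```python
def replace_with_madd(add_frac, mult_frac, paired_frac):
    """Return the instruction mix after fusing add/mult pairs into madd.

    All arguments are fractions of the ORIGINAL instruction count.
    paired_frac is the fraction of original instructions that are adds
    which can be fused with a mult (an assumed input: the question does
    not say how many pairs exist).
    """
    if not 0.0 <= paired_frac <= min(add_frac, mult_frac):
        raise ValueError("cannot fuse more pairs than there are adds or mults")

    other_frac = 1.0 - add_frac - mult_frac
    # Each fused pair turns two instructions into one, so the program
    # shrinks by paired_frac of its original length.
    new_total = 1.0 - paired_frac

    return {
        "madd": paired_frac / new_total,
        "add": (add_frac - paired_frac) / new_total,
        "mult": (mult_frac - paired_frac) / new_total,
        "other": other_frac / new_total,
    }


if __name__ == "__main__":
    # Every mult pairs with an add: 10% of the original instructions fuse.
    print(replace_with_madd(0.30, 0.10, 0.10))
    # Nothing can be paired: the mix is unchanged and madd never appears.
    print(replace_with_madd(0.30, 0.10, 0.0))
```

Even with the new mix in hand, the effect on total cycles still depends on the CPI of madd, which the question does not give, so the sketch deliberately stops at the weights.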