time-complexity cpu-architecture multiplication addition alu

Effective time complexity for n bit addition and multiplication

I have done a course on Computer Architecture and it was mentioned that on the most efficient processors with n bit architecture word size the addition/subtraction of two words has a time complexity of O(log n) while multiplication/division has a time complexity of O(n).

If you do not consider any particular architecture word size the best time complexity of addition/subtraction is O(n) (https://www.academia.edu/42811225/Fast_Arithmetic_Speeding_up_Multiplication_Division_and_Addition_of_n_Bit_Numbers) and multiplication/division seems to be O(n log n log log n) (Strassen https://en.m.wikipedia.org/wiki/Multiplication_algorithm).

Is this correct?

Solution

O(log n) is the latency of addition if you can use n-bit wide parallel hardware with stuff like carry-select or carry-lookahead.

O(n) is the total amount of work that needs doing, and thus the time complexity with a fixed-width ALU for arbitrary bigint problems as n tends towards infinity.

For a multiply, there are n partial products in an n-bit multiply, so adding them all (with a Dadda tree for example) takes on the order of O(log n) gate delays of latency. Integer addition is associative, so you can do that in parallel, e.g. (a+b) + (c+d) is 3 with the latency of 2, and it gets better from there.

Dadda trees can avoid some of the carry-propagation latency so I guess it avoids the extra factor of log n you'd get if you you just used normal addition of each partial product separately.

See Differences between Wallace Tree and Dadda Multipliers for more about practical considerations for huge Dadda trees.