Search code examples
algorithmrecursionperformancestrassen

Crossover Point: Strassen's Algorithm


What is the optimal crossover point under which Strassen's Algorithm should stop the recursion and apply the multiplication, in terms of efficiency?

I know this is closely connected to the specific implementation and hardware but there should be some kind of guideline or some experimental results from someone for a general case.

Searching a bit online and asking some people they tend to think it's

n = 64; 

or

 n = 32;

Can anyone verify / reject these results?


Solution

  • This should be tuned on a per machine basis (a bit like ATLAS does). This kind of optimizations pay off for quite large matrices: if you code it yourself, and compare it to eg. a vendor BLAS implementation, then you would find a quite large n.

    Memory requirements for Strassen algorithm are also something to weigh.