The BLAS gemm routine in Intel MKL normally works with three matrices. It computes C = alpha * A * B + beta * C, which I will write as f(A, B, C), where alpha and beta are scaling factors.
But can one write f(A, B, A) with alpha = 1, beta = 0 in order to simply get A = A * B? I mean that the two A's in f(A, B, A) are the same variable. (All matrices here are square.)
Certainly, if we set a third variable C = A, then f(A, B, C) works. But it would be much better to avoid making this copy C at all.
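No, this is not allowed. The output matrix of gemm must not overlap either input: the routine writes entries of C while it still needs to read A, so passing the same array for both gives undefined results. You either have to introduce a temporary buffer or find another way to do the job.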
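One possible "other way" is sketched below, as an illustration rather than an MKL feature. Row i of A * B depends only on row i of A, so the product can be formed one row at a time into a scratch row of n elements and then copied back, instead of duplicating the whole matrix. The function name matmul_inplace_rows and the row-major layout are assumptions made for this example. The plain loops stand in for whatever BLAS call you would actually use per row or per block of rows.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of MKL or BLAS): overwrite the n x n
 * row-major matrix a with a * b, using only an n-element scratch row.
 * Returns 0 on success, -1 if the scratch allocation fails. */
int matmul_inplace_rows(double *a, const double *b, size_t n)
{
    if (n == 0)
        return 0;               /* nothing to do for an empty matrix */

    double *row = malloc(n * sizeof *row);
    if (row == NULL)
        return -1;

    for (size_t i = 0; i < n; ++i) {
        /* row = a[i, :] * b, computed from the still-unmodified row i */
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            row[j] = sum;
        }
        /* Row i of a is no longer needed as input, so overwrite it. */
        memcpy(&a[i * n], row, n * sizeof *row);
    }

    free(row);
    return 0;
}
```

This cuts the extra storage from n * n elements to n, at the cost of giving up a single large gemm call. Processing several rows per step with a correspondingly larger scratch block would be the natural compromise if performance matters.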