I tried to use opt-einsum to generate a contraction path for a Fortran implementation, and I came across the expression TDOT:
https://optimized-einsum.readthedocs.io/en/stable/greedy_path.html?highlight=tdot
scaling   BLAS   current        remaining
4         TDOT   tfp,fr->tpr    tpr->tpr
I cannot find it in http://www.netlib.org/lapack/explore-html/index.html or on other BLAS webpages after a few searches :(
What is it in BLAS?
You are correct: TDOT is not part of BLAS. If we look at this particular expression, we see that the data needs to be reorganized before a BLAS call can happen, because the "zip" index f is not on the left- or right-hand side of each tensor (it sits in the middle of tfp).
TDOT tfp,fr->tpr
---
tmp_tpf = tfp -> tpf # Reorganize data
tmp_tpf,fr -> tpr # Standard BLAS call with `f` as the zip index
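For illustration, the two steps above can be sketched in NumPy. This is a minimal sketch rather than what opt_einsum does internally: the sizes and the names `A`, `B`, `A_tpf`, and `C` are hypothetical, and it assumes a NumPy build where a 2-D matrix product is dispatched to BLAS GEMM.

```python
import numpy as np

# Hypothetical sizes chosen only for illustration.
t, f, p, r = 3, 4, 5, 6
rng = np.random.default_rng(0)
A = rng.standard_normal((t, f, p))  # indices t, f, p
B = rng.standard_normal((f, r))     # indices f, r

# Step 1: reorganize tfp -> tpf. ascontiguousarray forces the actual
# memory copy so that f becomes the last (contiguous) index.
A_tpf = np.ascontiguousarray(A.transpose(0, 2, 1))

# Step 2: fold (t, p) into a single row index so the contraction becomes
# an ordinary matrix product of shape (t*p, f) @ (f, r).
C = (A_tpf.reshape(t * p, f) @ B).reshape(t, p, r)

# Cross-check against the direct einsum expression.
reference = np.einsum("tfp,fr->tpr", A, B)
matches = np.allclose(C, reference)
print(matches)
```

In a Fortran implementation the same pattern would presumably be an explicit transpose into a temporary array followed by a single GEMM call on the reshaped operands.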
There are some libraries, such as TBLIS, which expand upon BLAS functionality and allow for non-contiguous expressions by transposing data as blocks are moved from RAM to cache, for extremely high performance. TDOT is explicitly flagged in the opt_einsum docs because it is generally not a good thing: it requires a memory copy before the contraction, and that copy can often be the bottleneck!
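To make the copy concrete, here is a small NumPy sketch (hypothetical sizes and variable names, not taken from opt_einsum). It shows that a transpose on its own is only a strided view, while laying the data out contiguously for a BLAS call allocates a second buffer as large as the original tensor.

```python
import numpy as np

A = np.zeros((3, 4, 5))             # tfp, hypothetical sizes
view = A.transpose(0, 2, 1)         # tpf as a strided view: no data moved
copy = np.ascontiguousarray(view)   # tpf laid out contiguously: a full copy

is_view = np.shares_memory(A, view)      # the view reuses A's buffer
is_copy = not np.shares_memory(A, copy)  # the contiguous array does not
extra_bytes = copy.nbytes                # extra memory paid for the reorganization
print(is_view, is_copy, extra_bytes == A.nbytes)
```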
Quick note: I'm the author of opt_einsum, and I would love a PR if you get the chance!