fortran, tensor, blas, numpy-einsum

What is the TDOT subroutine in BLAS?


I tried to use opt-einsum to generate a contraction path for a Fortran implementation, and I came across the expression TDOT in its documentation: https://optimized-einsum.readthedocs.io/en/stable/greedy_path.html?highlight=tdot

    scaling   BLAS   current        remaining
    4         TDOT   tfp,fr->tpr    tpr->tpr

I cannot find it at http://www.netlib.org/lapack/explore-html/index.html or on other BLAS web pages after some searching :(

What is it in BLAS?


Solution

  • You are correct: TDOT is not part of BLAS. If we look at this particular expression, we see that the data needs to be reorganized before a BLAS call can happen, because the "zip" (contracted) index f is not at the left or right edge of each tensor: in tfp it sits in the middle.

    TDOT            tfp,fr->tpr
    ---
    tmp_tpf = tfp -> tpf # Reorganize data
    tmp_tpf,fr -> tpr    # Standard BLAS call with `f` as the zip index
    

    Some libraries, such as TBLIS, extend BLAS functionality and allow non-contiguous expressions by transposing data as blocks are moved from RAM to cache, which gives extremely high performance. The opt_einsum docs call out TDOT explicitly because it is generally undesirable: it needs a memory copy before the contraction, and that copy can often be the bottleneck!

    Quick note: I'm the author of opt_einsum, and I would love a PR if you get the chance!
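
The two-step decomposition in the answer can be checked numerically. The following is a minimal NumPy sketch, not opt_einsum's internal implementation: the array sizes and variable names are hypothetical, chosen only for illustration. It performs the explicit tfp -> tpf copy, then folds t and p into a single row index so that the contraction over f becomes an ordinary matrix product (the kind of operation NumPy typically hands to a BLAS GEMM routine).

```python
import numpy as np

# Hypothetical sizes for the indices t, f, p, r (illustration only)
t, f, p, r = 2, 3, 4, 5
rng = np.random.default_rng(0)
A = rng.standard_normal((t, f, p))  # tensor with indices tfp
B = rng.standard_normal((f, r))     # matrix with indices fr

# Reference result computed directly with einsum
ref = np.einsum("tfp,fr->tpr", A, B)

# Step 1: reorganize tfp -> tpf. The explicit contiguous copy is the
# extra memory traffic that the answer warns about.
A_tpf = np.ascontiguousarray(A.transpose(0, 2, 1))

# Step 2: fold (t, p) into one row index, so the contraction over f is
# a plain matrix product: (t*p, f) @ (f, r) -> (t*p, r)
C = (A_tpf.reshape(t * p, f) @ B).reshape(t, p, r)

# Both routes give the same tensor
assert np.allclose(C, ref)
```

In a Fortran implementation the same pattern would presumably be an explicit transpose into a temporary array followed by a single GEMM call, with the temporary being the cost that distinguishes a TDOT step from a plain GEMM step.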