optimization gcc loops gcc4 tiling

How to enable Loop tiling in gcc?


How can I compile code with gcc so that it performs loop tiling (blocking)? By default, the -O3 optimization level does not do loop tiling. I need to enable loop tiling in addition to this flag and also find out the tile factor (e.g. cubic or rectangular tiling), i.e. the internal tiling heuristics.

Thanks


Solution

  • You haven't provided the exact gcc version, example code, or the resulting code, and a quick search would have turned this up, but possibly this already answers your question:

    Strip mining is an optimization that has been introduced into gcc with the merge of the graphite branch in version 4.4. See also the manual:

    -floop-strip-mine

    Perform loop strip mining transformations on loops. Strip mining splits a loop into two nested loops. The outer loop has strides equal to the strip size and the inner loop has strides of the original loop within a strip. The strip length can be changed using the loop-block-tile-size parameter. For example, given a loop like:

              DO I = 1, N
                A(I) = A(I) + C
              ENDDO
    

    loop strip mining will transform the loop as if the user had written:

              DO II = 1, N, 51
                DO I = II, min (II + 50, N)
                  A(I) = A(I) + C
                ENDDO
              ENDDO
    

    This optimization applies to all the languages supported by GCC and is not limited to Fortran. To use this code transformation, GCC has to be configured with --with-ppl and --with-cloog to enable the Graphite loop transformation infrastructure.

    You can run man gcc | grep '\-floop\-strip\-mine' to check whether your gcc supports that option. To get the exact gcc version, run gcc --version.
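
For readers more at home in C than Fortran, the strip-mining transformation quoted from the manual can be sketched by hand as below. This is only an illustration of the loop shape, not what gcc actually emits. The function names and the STRIP constant are hypothetical; 51 is chosen only to mirror the manual's example and is not claimed to be a gcc default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strip size, mirroring the 51-iteration strips in the
   manual's Fortran example. */
#define STRIP 51

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Original loop: add c to every element of a. */
void add_const(double *a, size_t n, double c)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        a[i] += c;
}

/* Hand-written strip-mined equivalent: the outer loop steps by STRIP,
   the inner loop walks one strip, clamped to n for the last strip. */
void add_const_strip_mined(double *a, size_t n, double c)
{
    assert(a != NULL || n == 0);
    for (size_t ii = 0; ii < n; ii += STRIP) {
        size_t end = min_size(ii + STRIP, n);
        for (size_t i = ii; i < end; i++)
            a[i] += c;
    }
}
```

Both functions produce the same array contents; only the iteration structure differs. To ask gcc to do this itself, something like gcc -O3 -floop-strip-mine --param loop-block-tile-size=51 file.c should work on a Graphite-enabled build, where file.c is a placeholder for your own source file. The same section of the manual also documents -floop-block, which performs loop blocking and is closer to the tiling the question asks about.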