Tags: c++, c++-faq, std-span, mdspan

What is an mdspan, and what is it used for?


Over the past year or so I've noticed a few C++-related answers on StackOverflow refer to mdspan's - but I've never actually seen these in C++ code. I tried looking for them in my C++ compiler's standard library directory and in the C++ coding guidelines - but couldn't find them. I did find std::span's; I'm guessing they're related - but how? And what does this addition of "md" stand for?

Please explain what this mysterious entity is about, and when I might want to use it.


Solution

  • TL;DR: mdspan is an extension of std::span for multiple dimensions - with a lot of (unavoidable) flexible configurability w.r.t. memory layout and modes of access.


    Before you read this answer, you should make sure you're clear on what a span is and what it's used for. Now that that's out of the way: Since mdspan's can be rather complex beasts (typically ~7x or more the source code of an std::span implementation), we'll start with a simplified description, and keep the advanced capabilities for further below.

    "What is it?" (simple version)

    An mdspan<T> is:

    1. Literally, a "multi-dimensional span" (of type-T elements).
    2. A generalization of std::span<T>, from a uni-dimensional/linear sequence of elements to multiple dimensions.
    3. A non-owning view of a contiguous sequence of elements of type T in memory, interpreted as a multi-dimensional array.
    4. Basically just a struct { T * ptr; size_type extents[d]; } with some convenience methods (for d dimensions whose extents are determined at run-time).
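
    To make point 4 concrete, here is a minimal sketch of such a struct. The name tiny_mdspan and its at() method are made up for illustration; this is not how std::mdspan is implemented, and it assumes the default C-style (row-major) layout only:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical illustration type; NOT part of the standard library.
template <typename T, std::size_t Rank>
struct tiny_mdspan {
    T* ptr;                                 // non-owning pointer to the first element
    std::array<std::size_t, Rank> extents;  // size of each dimension

    // Total number of elements covered by the view.
    std::size_t size() const {
        std::size_t n = 1;
        for (std::size_t e : extents) n *= e;
        return n;
    }

    // Row-major (C-style) lookup: the last index varies fastest.
    T& at(const std::array<std::size_t, Rank>& idx) const {
        std::size_t offset = 0;
        for (std::size_t d = 0; d < Rank; ++d) {
            assert(idx[d] < extents[d]);          // simple bounds check
            offset = offset * extents[d] + idx[d];
        }
        return ptr[offset];
    }
};
```

    For a buffer holding the values 1 to 12 and extents { 2, 3, 2 }, at({1, 2, 0}) computes the offset (1 * 3 + 2) * 2 + 0 = 10, i.e. the value 11.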

    Illustration of mdspan-interpreted layout

    If we have:

    std::vector v = {1,2,3,4,5,6,7,8,9,10,11,12};
    

    we can view the data of v as a 1D array of 12 elements, similar to its original definition:

    auto sp1 = std::span(v.data(), 12);
    auto mdsp1 = std::mdspan(v.data(), 12);
    

    or a 2D array of extents 2 x 6:

    auto mdsp2 = std::mdspan(v.data(), 2, 6);
    // (  1,  2,  3,  4,  5,  6 ),
    // (  7,  8,  9, 10, 11, 12 )
    

    or a 3D array 2 x 3 x 2:

    auto ms3 = std::mdspan(v.data(), 2, 3, 2);
    // ( ( 1,  2 ), ( 3,  4 ), (  5,  6 ) ),
    // ( ( 7,  8 ), ( 9, 10 ), ( 11, 12 ) )
    

    and we could also consider it as a 3 x 2 x 2 or 2 x 2 x 3 array, or 3 x 4 and so on.

    "When should I use it?"

    • (C++23 and later) When you want to use the multi-dimensional operator[] on some buffer you get from somewhere. Thus in the example above, ms3[1, 2, 0] is 11 and ms3[0, 1, 1] is 4.

    • When you want to pass multi-dimensional data without separating the raw data pointer and the dimensions. You've gotten a bunch of elements in memory, and want to refer to them using more than one dimension. Thus instead of:

      void print_matrix_element(
         float const* matrix, size_t row_width, size_t x, size_t y) 
      {
         std::print("{}", matrix[row_width * x + y]);
      }
      

      you could write:

      void print_matrix_element(
          std::mdspan<float const, std::dextents<size_t, 2>> matrix,
          size_t x, size_t y)
      {
         std::print("{}", matrix[x, y]);
      }
      
    • As the right type for passing multidimensional C arrays around:
      C supports multidimensional arrays perfectly... as long as their dimensions are given at compile time, and you don't try passing them to functions. Doing that is a bit tricky because the outermost dimension experiences decay, so you would actually be passing a pointer. But with mdspans, you can write this:

      template <typename T, typename Extents>
      void print_3d_array(std::mdspan<T, Extents> ms3)
      {
         static_assert(ms3.rank() == 3, "Unsupported rank");
         // read back using 3D view
         for(size_t i=0; i != ms3.extent(0); i++) {
           fmt::print("slice @ i = {}\n", i);
           for(size_t j=0; j != ms3.extent(1); j++) {
             for(size_t k=0; k != ms3.extent(2); k++)
               fmt::print("{} ",  ms3[i, j, k]);
             fmt::print("\n");
           }
         }  
      }
      
      int main() {
          int arr[2][3][2] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 };
      
          auto ms3 = std::mdspan(&arr[0][0][0], 2, 3, 2);
            // Note: Not the most elegant construction
      
          print_3d_array(ms3);
      }
      
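
    The construction in main() above repeats the extents 2, 3, 2 by hand. One possible way around that (a sketch only; the helper names c_array_extents and c_array_extents_impl are hypothetical, not standard facilities) is to read the extents off the C array type using the standard std::rank and std::extent type traits:

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helpers (not standard library facilities): collect the
// compile-time extents of a C array type such as int[2][3][2].
template <typename Array, std::size_t... Dims>
constexpr std::array<std::size_t, sizeof...(Dims)>
c_array_extents_impl(std::index_sequence<Dims...>)
{
    // std::extent_v<Array, N> is the size of the N-th dimension of Array.
    return { std::extent_v<Array, Dims>... };
}

template <typename Array>
constexpr auto c_array_extents()
{
    // std::rank_v<Array> is the number of dimensions of Array.
    return c_array_extents_impl<Array>(
        std::make_index_sequence<std::rank_v<Array>>{});
}
```

    With int arr[2][3][2], c_array_extents<decltype(arr)>() yields the three values 2, 3, 2, which could then be forwarded to whatever constructs the mdspan.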

    Standardization status

    std::mdspan has been standardized, and is part of C++23 (while std::span was already available in C++20). However, not all relevant facilities were standardized, e.g. the ability to take a sub-mdspan of an existing mdspan will likely only enter the standard in C++26.

    Even before C++23, you could already use a reference implementation, which comes from the "Kokkos performance portability ecosystem" of the US Sandia National Laboratories.

    "What are those 'extra capabilities' which mdspan offers?"

    An mdspan actually has 4 template parameters, not just the element type and the extents:

    template <
        class T,
        class Extents,
        class LayoutPolicy = layout_right,
        class AccessorPolicy = default_accessor<T>
    >
    class mdspan;
    

    This answer is already rather long, so we won't give the full details, but:

    • Some of the extents can be "static" rather than "dynamic": specified at compile time, and thus not stored in instance data members. Only the "dynamic" extents are stored. For example, this:

      auto my_extents = extents<size_t, dynamic_extent, 3, dynamic_extent>{ 2, 4 };
      

      ... is an extents object corresponding to dextents<size_t, 3>{ 2, 3, 4 }, but which only stores the values 2 and 4 in the class instance; with the compiler knowing it needs to plug in 3 whenever the second dimension is used.

    • You can have the dimensions go from-minor-to-major, in Fortran style instead of from-major-to-minor like in C. Thus, if you set LayoutPolicy = layout_left, then mds[x,y] is at mds.data_handle()[mds.extent(0) * y + x] instead of the usual mds.data_handle()[mds.extent(1) * x + y].

    • You can "reshape" your mdspan into another mdspan with different dimensions but the same overall size, by constructing a new mdspan with the new extents over the same data.

    • You can define a layout policy with "strides": Have consecutive elements in the mdspan be at a fixed distance in memory; have extra offsets at the beginning and/or the end of each line or dimensional slice; etc.

    • You can "cut up" your mdspan with offsets in every dimension (e.g. take a submatrix of a matrix) - and the result is still an mdspan! ... that's because you can have an mdspan with a LayoutPolicy which incorporates these offsets. This functionality is not available in C++23 IIANM.

    • Using the AccessorPolicy, you can make mdspan's which actually do own the data they refer to, individually or collectively.
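
    The layout-related bullets above all boil down to different ways of turning (x, y) into an offset from the start of the data. Here is a small sketch of that arithmetic for the 2D case. The function names are hypothetical and this is plain index math, not the actual layout policy classes:

```cpp
#include <cstddef>

// Hypothetical free functions illustrating the index arithmetic only;
// they are not the standard layout policy classes.

// C style (as with layout_right): the last index varies fastest.
constexpr std::size_t offset_right(std::size_t x, std::size_t y,
                                   std::size_t /*extent0*/, std::size_t extent1)
{
    return extent1 * x + y;
}

// Fortran style (as with layout_left): the first index varies fastest.
constexpr std::size_t offset_left(std::size_t x, std::size_t y,
                                  std::size_t extent0, std::size_t /*extent1*/)
{
    return extent0 * y + x;
}

// Strided: each dimension has its own step, so rows may be padded
// or elements may be spaced apart in memory.
constexpr std::size_t offset_strided(std::size_t x, std::size_t y,
                                     std::size_t stride0, std::size_t stride1)
{
    return stride0 * x + stride1 * y;
}
```

    For extents 2 x 6, element (1, 2) sits at offset 8 in the C-style layout and at offset 5 in the Fortran-style one. With strides (6, 1) the strided version reproduces the C-style result, and with strides (8, 1) it describes rows of 6 elements padded out to 8.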

    Further reading

    (some examples were adapted from these sources.)