c++ cuda gpu thrust

Is it possible to overcome the maximum number of iterators in thrust::zip_iterator?


I’m using Thrust for some tasks at work and have found that there seems to be a maximum number of iterators when constructing a zip_iterator.

For example

#include <thrust/iterator/zip_iterator.h>
#include <thrust/device_vector.h>

int main() {

  thrust::device_vector<int> A(10),B(10),C(10);

  auto zitor = thrust::make_zip_iterator(A.begin(),A.begin(),
                                         B.begin(),B.begin(),
                                         B.begin(),B.begin(),
                                         B.begin(),B.begin(),
                                         C.begin(),C.begin());

}

This code compiles successfully, but if I add one more iterator to the argument list (eleven in total), an error occurs:

multizip.cu(8): error: no instance of overloaded function "thrust::make_zip_iterator" matches the argument list
            argument types are: (thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>, thrust::detail::normal_iterator<thrust::device_ptr<int>>)

1 error detected in the compilation of "multizip.cu".

Does it accept at most ten iterators, or am I misunderstanding something? If so, are there any workarounds?


  using Itor  = thrust::device_vector<int>::iterator;
  using Zitor = thrust::zip_iterator<thrust::tuple<
                  Itor,Itor,Itor,Itor,Itor,Itor,
                  Itor,Itor,Itor,Itor,Itor>>;

This piece of code does not compile either unless I remove one Itor. The error message is:

multizip.cu(17): error: too many arguments for class template "thrust::tuple"

So I believe the number of iterators in thrust::tuple does indeed have a maximum limit of 10. Can one overcome this limitation?


Solution

  • The proximate issue is that, until recently, thrust::tuple (and therefore thrust::zip_iterator, which is built on it) had a template limit of 10 items (iterators). There are three options for fixing or working around the issue:

    1. The Thrust zip_iterator design was changed recently (apparently between CUDA 12.3 and CUDA 12.4) to allow more than 10 iterators when constructing a zip_iterator. One option is therefore to update your CUDA Toolkit to version 12.4.1 or later.

    2. A related option is to use a more recent version of the CCCL* with your current CUDA Toolkit. Recent versions of the CCCL have a defined compatibility path for this. This is not a complete recipe, but in a nutshell: assuming you are within the defined compatibility path, clone the current CCCL repository onto your machine, then point your CUDA compiler at that repository with -I**. Thrust is a header-only template library, so it needs no separate compilation step or any other installation steps.

    3. Depending on your needs, you could stay within the 10-iterator limit of older Thrust versions by creating a zip_iterator of zip_iterators. This gets you past 10 iterators, albeit in a nested arrangement.


    *: The CUDA C++ Core Libraries (CCCL) were introduced in 2023 to house Thrust, CUB and libcu++/libcudacxx in a single repository because of their interactions and overlap. The original Thrust repository is no longer updated and does not contain the modernized zip_iterator.

    **: Do not use -isystem, because headers found that way get lower priority than the CCCL headers packaged with the CUDA Toolkit.