Thrust: How can I access my flattened array with thrust::make_zip_iterator?

Could someone explain why I can't access my data the way I expect?

I have a flattened vector:

thrust::host_vector<double> input(10*3);

It holds point data stored as X, Y, Z, X, Y, Z, ...

I tried to use a zip_iterator to access my data, so I wrote:

typedef thrust::tuple<double, double, double, int>  tpl4int;
typedef thrust::host_vector<double>::iterator doubleiter;
typedef thrust::host_vector<int>::iterator intiter;

typedef thrust::tuple<doubleiter, doubleiter, doubleiter, intiter>  tpl4doubleiter;
typedef thrust::zip_iterator<tpl4doubleiter>  tpl4zip;

tpl4zip first = thrust::make_zip_iterator(thrust::make_tuple(input.begin(), input.begin() + N/3, input.begin() + 2*N/3, K.begin()));

I try to access my data like this:

  std::vector<tpl4int> result_sorted(N/3);
  thrust::copy(first,first+N/3,result_sorted.begin());  

  std::cout << "X = " << thrust::get<0>(result_sorted[0]) << std::endl;
  std::cout << "Y = " << thrust::get<1>(result_sorted[0]) << std::endl;
  std::cout << "Z = " << thrust::get<2>(result_sorted[0]) << std::endl;
  std::cout << "index = " << thrust::get<3>(result_sorted[0]) << std::endl;

but I don't get the expected result. I get:

X = 1.0245
Y = 1.0215
Z = 5.001
index = 0

instead of the first point:

      input[0] = 1.0245;
      input[1] = 2.54;
      input[2] = 3.001;
      index    = 0;

Could someone tell me where I'm going wrong?

Here is the full code:

#include <thrust/host_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/sequence.h>
#include <thrust/copy.h>
#include <thrust/tuple.h>
#include <iostream>
#include <vector>

#define N 30 // make this evenly divisible by 3 for this example

typedef thrust::tuple<double, double, double, int>  tpl4int;
typedef thrust::host_vector<double>::iterator doubleiter;
typedef thrust::host_vector<int>::iterator intiter;

typedef thrust::tuple<doubleiter, doubleiter, doubleiter, intiter>  tpl4doubleiter;
typedef thrust::zip_iterator<tpl4doubleiter>  tpl4zip;

int main() 
{
   thrust::host_vector<double> input(10*3);

      int i=0;

//     input[0] = vec3(0,0,5.005);
      input[i++] = 1.0245;
      input[i++] = 2.54;
      input[i++] = 3.001;

//     input[1] = vec3(0,0,5.005);
      input[i++] = 2.0;
      input[i++] = 1.0;
      input[i++] = 5.01125;

//     input[2] = vec3(0,0,5.005);
      input[i++] = 6.0;
      input[i++] = 1.0;
      input[i++] = 5.0145;

    
//     input[3] = vec3(2,1,5.001);
      input[i++] = 6.0;
      input[i++] = 1.0215;
      input[i++] = 6.001;

//     input[4] = vec3(3,0,5.001);
      input[i++] = 6.0;
      input[i++] = 1.0845;
      input[i++] = 5.00125;

//     input[5] = vec3(4,0,5.001);
      input[i++] = 5.0;
      input[i++] = 0.0;
      input[i++] = 5.001;
    
//     input[6] = vec3(5,0,5.001);
      input[i++] = 5.0;
      input[i++] = 0.0;
      input[i++] = 5.001;

//     input[7] = vec3(6,0,10.501);
      input[i++] = 6.0;
      input[i++] = 0.0;
      input[i++] = 10.501;

//     input[8] = vec3(0,0,5.001);
      input[i++] = 1.0;
      input[i++] = 0.0;
      input[i++] = 5.0015478;

//     input[9] = vec3(0,0,5.001);
      input[i++] = 6.0;
      input[i++] = 1.005;
      input[i++] = 5.001;
      

  thrust::host_vector<int> K(N/3);          // keys in one row
  thrust::sequence(K.begin(), K.end(), 0);  // set index for key

  tpl4zip first = thrust::make_zip_iterator(thrust::make_tuple(input.begin(), input.begin() + N/3, input.begin() + 2*N/3, K.begin()));

  std::vector<tpl4int> result_sorted(N/3);
  thrust::copy(first,first+N/3,result_sorted.begin());  

  std::cout << "X = " << thrust::get<0>(result_sorted[0]) << std::endl;
  std::cout << "Y = " << thrust::get<1>(result_sorted[0]) << std::endl;
  std::cout << "Z = " << thrust::get<2>(result_sorted[0]) << std::endl;
  std::cout << "index = " << thrust::get<3>(result_sorted[0]) << std::endl;

  return 0;
}

Thanks in advance.

Solution

This is how a zip iterator works: it takes multiple buffers (structure of arrays, SoA) and lets you access them as if you had one buffer of tuples (array of structures, AoS).

The first element your zip iterator yields is the tuple (input[0], input[10], input[20]), ignoring the integer. The second is (input[1], input[11], input[21]), and so on. This follows from how you initialized the zip iterator: with input.begin(), input.begin() + N/3 and input.begin() + 2*N/3. Initializing it with input.begin(), input.begin() + 1 and input.begin() + 2 instead would not work either, because each iterator advances by one element per step; you would need an iterator with a configurable stride to do it that way.
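To make the index mapping concrete, here is a minimal sketch in plain C++ (no Thrust needed) that mimics which elements are read. The helpers zipped_at and interleaved_at are hypothetical names made up for this illustration; they are not part of Thrust.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Thrust API): the (x, y, z) triple that a zip
// iterator built from begin(), begin() + n/3 and begin() + 2*n/3 reads at
// position k.
std::array<double, 3> zipped_at(const std::vector<double>& input, std::size_t k)
{
    assert(input.size() % 3 == 0);
    const std::size_t third = input.size() / 3;
    assert(k < third);
    return { input[k], input[k + third], input[k + 2 * third] };
}

// Hypothetical helper: the triple the question expected, i.e. point k of an
// interleaved X,Y,Z,X,Y,Z,... buffer.
std::array<double, 3> interleaved_at(const std::vector<double>& input, std::size_t k)
{
    assert(input.size() % 3 == 0);
    assert(k < input.size() / 3);
    return { input[3 * k], input[3 * k + 1], input[3 * k + 2] };
}
```

With the 30-element input from the question, zipped_at(input, 0) reads input[0], input[10] and input[20], which are exactly the values 1.0245, 1.0215 and 5.001 that were printed, while interleaved_at(input, 0) reads input[0], input[1] and input[2].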

Using a zip iterator is often good for performance on the GPU, because it allows for memory coalescing. It might be a good practice on the CPU as well if you want to make use of SIMD vectorization.

If your input comes in AoS format, you will have to "transpose" it to use the zip iterator correctly. Depending on the use case this "transpose" might not be worth the effort, as it is an expensive operation on big vectors. Ideally you would get three input vectors x, y and z in the first place.
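One possible way to do that "transpose" on the host is sketched below in plain C++. PointsSoA and to_soa are hypothetical names introduced only for this example, and the sketch assumes the buffer length is a multiple of 3.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the three separate coordinate arrays (SoA).
struct PointsSoA {
    std::vector<double> x, y, z;
};

// Split an interleaved buffer X0,Y0,Z0,X1,Y1,Z1,... into three arrays.
// Assumption: interleaved.size() is a multiple of 3.
PointsSoA to_soa(const std::vector<double>& interleaved)
{
    assert(interleaved.size() % 3 == 0);
    const std::size_t n = interleaved.size() / 3;

    PointsSoA out;
    out.x.resize(n);
    out.y.resize(n);
    out.z.resize(n);

    for (std::size_t k = 0; k < n; ++k) {
        out.x[k] = interleaved[3 * k];      // X of point k
        out.y[k] = interleaved[3 * k + 1];  // Y of point k
        out.z[k] = interleaved[3 * k + 2];  // Z of point k
    }
    return out;
}
```

Once x, y and z live in separate Thrust vectors, zipping x.begin(), y.begin(), z.begin() and K.begin() with thrust::make_zip_iterator(thrust::make_tuple(...)) makes element 0 the tuple (input[0], input[1], input[2], 0) that the question expected.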

Another good reason to use zip iterators is that they allow you to use more inputs or outputs per operation for many algorithms.