Tags: c, cuda, gpgpu, thrust

How to avoid default construction of elements in thrust::device_vector?


  1. It seems when creating a new Thrust vector all elements are 0 by default - I just want to confirm that this will always be the case.

  2. If so, is there also a way to bypass the constructor responsible for this behavior for additional speed (since for some vectors I don't need them to have an initial value, e.g. if their raw pointers are being passed to CUBLAS as an output)?


Solution

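  • thrust::device_vector constructs the elements it contains using its supplied allocator, just like std::vector. With the default allocator, device_vector<T>(n) value-initializes every element, which is why arithmetic types such as float start out as 0. It's possible to control what the allocator does when the vector asks it to construct an element.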

    Use a custom allocator to avoid default-initialization of vector elements:

    #include <thrust/device_malloc_allocator.h>
    #include <thrust/device_vector.h>

    // uninitialized_allocator is an allocator which
    // derives from device_malloc_allocator and which has a
    // no-op construct member function
    template<typename T>
      struct uninitialized_allocator
        : thrust::device_malloc_allocator<T>
    {
      // note that construct is annotated as
      // a __host__ __device__ function
      __host__ __device__
      void construct(T *p)
      {
        // no-op
      }
    };
    
    // to make a device_vector which does not initialize its elements,
    // use uninitialized_allocator as the 2nd template parameter
    typedef thrust::device_vector<float, uninitialized_allocator<float> > uninitialized_vector;
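
As a rough usage sketch for the case in the question (handing the raw pointer to a routine that only writes its output), something like the following should work. It repeats the allocator so that it compiles on its own. The write_output kernel is a hypothetical stand-in for a library call such as a CUBLAS routine, and the size and fill value are arbitrary. It is written against the same Thrust 1.7-era interface as the snippet above, so newer Thrust releases may need adjustments.

```cuda
#include <thrust/device_malloc_allocator.h>
#include <thrust/device_vector.h>
#include <thrust/count.h>
#include <cuda_runtime.h>
#include <cassert>
#include <cstdio>

// Allocation is inherited from device_malloc_allocator; construct does nothing.
template<typename T>
struct uninitialized_allocator : thrust::device_malloc_allocator<T>
{
  __host__ __device__
  void construct(T *)
  {
    // no-op: leave the freshly allocated storage untouched
  }
};

typedef thrust::device_vector<float, uninitialized_allocator<float> > uninitialized_vector;

// Hypothetical stand-in for a library routine (e.g. a CUBLAS call) that
// only writes to its output buffer through a raw device pointer.
__global__ void write_output(float *out, int n, float value)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = value;
}

int main()
{
  const int n = 1 << 20;

  // The contents are indeterminate at this point: do not read before writing.
  uninitialized_vector vec(n);

  // Raw device pointer, as one would pass to an external library.
  float *raw = thrust::raw_pointer_cast(vec.data());

  write_output<<<(n + 255) / 256, 256>>>(raw, n, 13.0f);
  cudaDeviceSynchronize();

  // Every element has now been written by the kernel.
  assert(thrust::count(vec.begin(), vec.end(), 13.0f) == n);
  std::printf("done\n");
  return 0;
}
```

The vector still owns the memory and frees it on destruction. The only behavioral difference from a plain thrust::device_vector<float> is that the elements must be written before they are read.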
    

    You will still incur the cost of a kernel launch to invoke uninitialized_allocator::construct, but that kernel will be a no-op which will retire quickly. What you're really interested in is avoiding the memory bandwidth required to fill the array, which this solution does.

    There's a complete code example here.

    Note that this technique requires Thrust 1.7 or later.