I want to know why some code works fine with standard arrays but fails with CuArrays.
For example, I have an array time_idx defined as:
1×32 CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
0.71173 0.941251 0.571602 0.037198 0.212053 0.227296 0.457712 0.697708 0.788338 0.994031 0.228599 … 0.856314 0.830083 0.111376 0.0333812 0.722638 0.293733 0.114187 0.072304 0.275268
and a vector of CuArrays, vehicle_states, each of size 7×32:
3-element Vector{CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}:
[0.49417984 0.11234676 … 0.107337356 0.72619927; 0.46416637 0.21656695 … 0.18117706 0.18970703; … ; 0.15575896 0.79976654 … 0.3788491 0.29301012; 0.97315633 0.8638843 … 0.5506643 0.30244973]
[0.4448264 0.9205822 … 0.61369383 0.5310524; 0.75463957 0.29982162 … 0.13896087 0.09793778; … ; 0.60275537 0.39284942 … 0.2803427 0.7379274; 0.8305204 0.056631837 … 0.16771089 0.9385667]
[0.78282833 0.594285 … 0.65157485 0.82812166; 0.28565544 0.021899216 … 0.7051293 0.48643407; … ; 0.18139555 0.44223073 … 0.9017556 0.3409817; 0.5128845 0.79966474 … 0.039010685 0.53230214]
I want to concatenate them using broadcasting, but I get an error (this works fine with standard arrays):
vcat.(time_idx, vehicle_states) # CuArray only supports element types that are stored inline
But if I don't use broadcasting, it will work just fine:
[vcat(time_idx, vehicle_state) for vehicle_state in vehicle_states]
3-element Vector{CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}:
[0.7117298 0.94125116 … 0.07230403 0.27526757; 0.49417984 0.11234676 … 0.107337356 0.72619927; … ; 0.15575896 0.79976654 … 0.3788491 0.29301012; 0.97315633 0.8638843 … 0.5506643 0.30244973]
[0.7117298 0.94125116 … 0.07230403 0.27526757; 0.4448264 0.9205822 … 0.61369383 0.5310524; … ; 0.60275537 0.39284942 … 0.2803427 0.7379274; 0.8305204 0.056631837 … 0.16771089 0.9385667]
[0.7117298 0.94125116 … 0.07230403 0.27526757; 0.78282833 0.594285 … 0.65157485 0.82812166; … ; 0.18139555 0.44223073 … 0.9017556 0.3409817; 0.5128845 0.79966474 … 0.039010685 0.53230214]
Why is that?
When you try to run:
vcat.(time_idx, vehicle_states)
it's probably trying to make the outer container a CuArray instead of a Vector. If you look at the type of vehicle_states:
3-element Vector{CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}:
the outer type is just Vector: it's a Vector of CuArray. More importantly, you cannot have a CuArray of CuArray, for the same reason outlined in the error message.
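If the goal is to keep the broadcast syntax, one common idiom is to wrap time_idx in Ref so that broadcasting treats it as a single value and only iterates over the outer Vector. The sketch below uses plain Arrays as stand-ins so it runs without a GPU; I'd expect the same call to return a Vector of CuArrays when the inputs are CuArrays, but treat that as an assumption rather than something verified here.

```julia
# Plain Arrays stand in for the CuArrays from the question (assumption:
# this lets the sketch run on a machine without a GPU).
time_idx = rand(Float32, 1, 32)
vehicle_states = [rand(Float32, 7, 32) for _ in 1:3]

# Ref(time_idx) shields time_idx from broadcasting, so vcat is called
# once per element of vehicle_states with the whole time_idx array.
stacked = vcat.(Ref(time_idx), vehicle_states)

# Same result as the comprehension from the question.
reference = [vcat(time_idx, vehicle_state) for vehicle_state in vehicle_states]
same_result = stacked == reference
```

Here the outer container stays an ordinary Vector, which is why no array-of-arrays ever has to live on the GPU.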
The error message is basically saying that each element inside a CuArray has to be isbits (that's the only way to store them in VRAM). But when you have a Vector of Vector, each element is a "pointer to a vector", which is not "just some bits" and thus not GPU compatible.
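You can check this property yourself with Base's isbitstype. A minimal sketch, which needs no GPU (the variable names are just for illustration):

```julia
# Plain numeric types are isbits: their data is stored inline.
float_is_bits = isbitstype(Float32)                # true

# Tuples of isbits types are isbits as well.
tuple_is_bits = isbitstype(Tuple{Float32, Int})    # true

# An Array is a mutable, heap-allocated object, so it is not isbits.
vector_is_bits = isbitstype(Vector{Float32})       # false
```

The last case is the situation in the failing broadcast: each result of vcat is itself an array, so it cannot be used as the element type of a CuArray.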