Halide cross-compilation for GPU

I want to run Halide code on a GPU. There are tutorial examples on how to run on a GPU and on how to cross-compile, but none that combines cross-compilation with running on a GPU.

I tried to follow the method from the cross-compilation tutorial, but I am not sure how to configure the target.

target.os = Target::Windows; 
target.arch = Target::X86; 
target.bits = 64;
...

target.os = Target::Windows; // ???
target.arch = ??? ;
target.bits = 64;
std::vector<Target::Feature> gpu_features;
gpu_features.push_back(Target::OpenCL);
brighter.compile_to_file(...);

I develop the code in Ubuntu running in a virtual machine, while the host machine's OS is Windows. That is why I need to cross-compile in order to run on a GPU.

Is this supported or not?

Solution

To expand a bit, Halide can generate code for multiple GPU/accelerator/offload architectures in a single filter. The Target specifies the architecture, OS, and address width of the CPU that the host code will run on, plus the set of external execution APIs that are allowed. Funcs can be scheduled onto any of those APIs, e.g. OpenCL, OpenGL, CUDA, etc. Enabling an API in the Target is not enough on its own: for anything to actually run on a GPU, the Funcs must also be given an appropriate GPU schedule.

The code that will run on the accelerator or GPU is embedded inside a function that runs on the CPU architecture specified by the top-level Target.

One can write Halide code that interoperates fairly seamlessly between, for example, OpenCL and OpenGL.

Ultimately we plan to support running GPU (and other accelerator) code from every source language we support (currently C++; JavaScript is in a branch).

It is not possible to generate standalone code for a GPU. For example, one cannot get the OpenCL code without the CPU wrapper that invokes it.