javascript webgpu

Is it possible to use an array of arrays in WebGPU?


How do I create a buffer from an array of arrays in JavaScript and use it in a compute shader? For example: const arr = [new Float32Array([1, 3, 5]), new Float32Array([2, 4, 6]), ...] How do I create a buffer for arr and declare it in the compute shader?

The intention is to create both an input and an output in the same format.


Solution

  • In WGSL, you declare an array of arrays like this:

    array<array<type, count1>, count2>
    

    For example,

    array<array<vec4f, 3>, 2>
    

    is an array of 2 arrays of 3 vec4fs.
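
    To make the shape concrete, here is a minimal sketch of how the element count, the byte size, and a flat index could be computed on the JavaScript side for that 2 x 3 array of vec4fs. The names numRows, vecsPerRow, floatsPerVec and flatIndex are illustrative and not part of any API.

```javascript
// Shape of array<array<vec4f, 3>, 2>. All names here are illustrative.
const numRows = 2;        // outer array length
const vecsPerRow = 3;     // inner array length
const floatsPerVec = 4;   // a vec4f holds 4 f32 values

const totalFloats = numRows * vecsPerRow * floatsPerVec;
const totalBytes = totalFloats * Float32Array.BYTES_PER_ELEMENT;

// Index into one flat Float32Array of the f32 at [row][vec][component].
function flatIndex(row, vec, component) {
  return (row * vecsPerRow + vec) * floatsPerVec + component;
}

console.log(totalFloats, totalBytes);  // 24 96
console.log(flatIndex(1, 2, 3));       // 23, the last float
```

    The byte size is what you would pass as the size of the GPU buffer.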

    Back in JavaScript, how you read and write those is up to you.

    As one large array:

    await buffer.mapAsync(...);
    const ab = buffer.getMappedRange();  // get the entire buffer
    
    const allData  =  new Float32Array(ab); // so 2 * 3 * 4 float32s
    

    Or as an array of typed arrays:

    const floatsPerRow = 3 * 4;            // 3 vec4fs
    const bytesPerRow = floatsPerRow * 4;
    
    const data = [
       new Float32Array(ab, bytesPerRow * 0, floatsPerRow),
       new Float32Array(ab, bytesPerRow * 1, floatsPerRow),
    ];
    

    Or as an array of arrays of typed arrays:

    const floatsPerRow = 3 * 4;            // 3 vec4fs
    const bytesPerRow = floatsPerRow * 4;
    const bytesPerVec4f = 16;
    
    const data = [
      [
        new Float32Array(ab, bytesPerRow * 0 + bytesPerVec4f * 0, 4),
        new Float32Array(ab, bytesPerRow * 0 + bytesPerVec4f * 1, 4),
        new Float32Array(ab, bytesPerRow * 0 + bytesPerVec4f * 2, 4),
      ],
      [
        new Float32Array(ab, bytesPerRow * 1 + bytesPerVec4f * 0, 4),
        new Float32Array(ab, bytesPerRow * 1 + bytesPerVec4f * 1, 4),
        new Float32Array(ab, bytesPerRow * 1 + bytesPerVec4f * 2, 4),
      ],
    ];
    

    Of those, the first, a single Float32Array, is arguably the most common, since creating lots of views has a cost.
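
    If you do want per-row views, they can be generated in a loop rather than written out by hand. This is a minimal sketch: makeRowViews is a hypothetical helper, and a plain ArrayBuffer stands in for the one returned by getMappedRange().

```javascript
// Hypothetical helper: split one ArrayBuffer into per-row Float32Array views.
// The views share memory with the buffer; no data is copied.
function makeRowViews(arrayBuffer, floatsPerRow) {
  const bytesPerRow = floatsPerRow * Float32Array.BYTES_PER_ELEMENT;
  const numRows = Math.floor(arrayBuffer.byteLength / bytesPerRow);
  const views = [];
  for (let i = 0; i < numRows; ++i) {
    views.push(new Float32Array(arrayBuffer, bytesPerRow * i, floatsPerRow));
  }
  return views;
}

// Stand-in for the mapped range: 2 rows of 3 vec4fs, 4 bytes per float.
const mapped = new ArrayBuffer(2 * 3 * 4 * 4);
const rowViews = makeRowViews(mapped, 3 * 4);

// Writing through a view changes the underlying buffer.
rowViews[1][0] = 42;
console.log(new Float32Array(mapped)[12]);  // 42
```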

    As for uploading, you could either make typed arrays that view the same buffer (like above) and then upload the entire buffer, or upload one smaller typed array at a time.
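
    When the rows start out as separate typed arrays, as in the question, one way to get a single upload is to copy them into one contiguous Float32Array first. This is a sketch that assumes all rows have the same length; packRows is a hypothetical helper.

```javascript
// Hypothetical helper: copy equally sized rows into one contiguous Float32Array.
function packRows(rows) {
  const floatsPerRow = rows[0].length;
  const packed = new Float32Array(floatsPerRow * rows.length);
  rows.forEach((row, i) => packed.set(row, i * floatsPerRow));
  return packed;
}

const arr = [new Float32Array([1, 3, 5]), new Float32Array([2, 4, 6])];
const packed = packRows(arr);
console.log(Array.from(packed));  // [1, 3, 5, 2, 4, 6]

// With a GPUDevice and a storage buffer of packed.byteLength bytes,
// the whole array could then be uploaded in one call:
//   device.queue.writeBuffer(buffer, 0, packed);
```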

    For example, this uploads one row at a time:

    // WebGPU Simple Compute Shader
    // from https://webgpufundamentals.org/webgpu/webgpu-simple-compute.html
    
    
    async function main() {
      const adapter = await navigator.gpu?.requestAdapter();
      const device = await adapter?.requestDevice();
      if (!device) {
        fail('need a browser that supports WebGPU');
        return;
      }
    
      const module = device.createShaderModule({
        label: 'doubling compute module',
        code: `
          @group(0) @binding(0) var<storage, read_write> data: array<array<f32, 3>>;
    
          @compute @workgroup_size(1) fn computeSomething(
            @builtin(global_invocation_id) id: vec3<u32>
          ) {
            let col = id.x;
            let row = id.y;
            
            data[row][col] = data[row][col] * 2.0;
          }
        `,
      });
    
      const pipeline = device.createComputePipeline({
        label: 'doubling compute pipeline',
        layout: 'auto',
        compute: {
          module,
          entryPoint: 'computeSomething',
        },
      });
    
      const input = [
        new Float32Array([
          11, 12, 13,
        ]),
        new Float32Array([
          31, 32, 33,
        ]),
      ];
    
      // create a buffer on the GPU to hold our computation
      // input and output
      const workBuffer = device.createBuffer({
        label: 'work buffer',
        size: input[0].byteLength * input.length,
        usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST,
      });
      // Copy our input data to that buffer
      const bytesPerRow = 3 * 4;  // 3 f32 values per row, 4 bytes each
      device.queue.writeBuffer(workBuffer, bytesPerRow * 0, input[0]);
      device.queue.writeBuffer(workBuffer, bytesPerRow * 1, input[1]);
    
      // create a buffer on the GPU to get a copy of the results
      const resultBuffer = device.createBuffer({
        label: 'result buffer',
        size: workBuffer.size,
        usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
      });
    
      // Setup a bindGroup to tell the shader which
      // buffer to use for the computation
      const bindGroup = device.createBindGroup({
        label: 'bindGroup for work buffer',
        layout: pipeline.getBindGroupLayout(0),
        entries: [
          { binding: 0, resource: { buffer: workBuffer } },
        ],
      });
    
      // Encode commands to do the computation
      const encoder = device.createCommandEncoder({
        label: 'doubling encoder',
      });
      const pass = encoder.beginComputePass({
        label: 'doubling compute pass',
      });
      pass.setPipeline(pipeline);
      pass.setBindGroup(0, bindGroup);
      pass.dispatchWorkgroups(input[0].length, input.length);
      pass.end();
    
      // Encode a command to copy the results to a mappable buffer.
      encoder.copyBufferToBuffer(workBuffer, 0, resultBuffer, 0, resultBuffer.size);
    
      // Finish encoding and submit the commands
      const commandBuffer = encoder.finish();
      device.queue.submit([commandBuffer]);
    
      // Read the results
      await resultBuffer.mapAsync(GPUMapMode.READ);
      const ab = resultBuffer.getMappedRange();
      const result = [
        new Float32Array(ab, bytesPerRow * 0, input[0].length),
        new Float32Array(ab, bytesPerRow * 1, input[0].length),
      ];
      console.log('input', input);
      console.log('result', result);
    
      resultBuffer.unmap();
    }
    
    function fail(msg) {
      // eslint-disable-next-line no-alert
      alert(msg);
    }
    
    main();

    In the example above the array is declared as

    @group(0) @binding(0) var<storage, read_write> data: array<array<f32, 3>>;
    

    Leaving off the length of the outer array means the size of the outer array is decided by the size of the buffer you bind at runtime.
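
    As a rough sketch of that rule for this particular declaration: each inner array<f32, 3> occupies 12 bytes, so the number of rows the shader sees is the bound size divided by 12, rounded down. runtimeRowCount is a hypothetical helper; inside the shader, the same value is available from arrayLength(&data).

```javascript
// Rows visible to a runtime-sized array<array<f32, 3>> for a binding of the given size.
// Hypothetical helper; assumes the binding starts at the beginning of the array.
function runtimeRowCount(bindingByteSize) {
  const bytesPerRow = 3 * Float32Array.BYTES_PER_ELEMENT;  // 12 bytes per array<f32, 3>
  return Math.floor(bindingByteSize / bytesPerRow);
}

console.log(runtimeRowCount(24));  // 2, matching the 24-byte work buffer in the example
console.log(runtimeRowCount(30));  // 2, trailing bytes that do not fill a row are ignored
```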

    You might find this article about memory layout and this WGSL offset computer helpful.
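
    One layout detail worth keeping in mind: the f32 and vec4f element types used above pack tightly, but a vec3f is aligned to 16 bytes, so each element of an array<vec3f, N> takes 4 floats rather than 3. The sketch below shows one way to pad such data on the JavaScript side; packVec3s is a hypothetical helper.

```javascript
// Hypothetical helper: pack [x, y, z] triples for a WGSL array of vec3f.
// Each element occupies 16 bytes (4 floats); the fourth float is padding.
function packVec3s(vec3s) {
  const floatsPerElement = 4;
  const out = new Float32Array(vec3s.length * floatsPerElement);
  vec3s.forEach((v, i) => out.set(v, i * floatsPerElement));
  return out;
}

const padded = packVec3s([[1, 2, 3], [4, 5, 6]]);
console.log(Array.from(padded));  // [1, 2, 3, 0, 4, 5, 6, 0]
```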