Is it possible to use an array of arrays in WebGPU?

How to create a buffer from an array of arrays in JavaScript and use it in a compute shader? For example: const arr = [new Float32Array([1, 3, 5]), new Float32Array([2, 4, 6]), ...] How to create a buffer for arr and declare it in the compute shader?

The intention is to create both an input and an output in the same format.

Solution

For WGSL you declare an array of arrays like this

array<array<type, count1>, count2>

For example

array<array<vec4f, 3> 2>

Is an array of 2 arrays of 3 vec4fs

Back in JavaScript, how you read/write those is up to you,

As one large array

await buffer.mapAsync(...);
const ab = buffer.getMappedRange();  // get the entire buffer

const allData  =  new Float32Array(ab); // so 2 * 3 * 4 float32s

Or arrays of typed arrays,

const floatsPerRow = 3 * 4;            // 3 vec4fs
const bytesPerRow = floatsPerRow * 4;

const data = [
   new Float32Array(ab, bytesPerRow * 0, floatsPerRow);
   new Float32Array(ab, bytesPerRow * 1, floatsPerRow);
];

Or arrays of arrays of arrays

const floatsPerRow = 3 * 4;            // 3 vec4fs
const bytesPerRow = floatsPerRow * 4;
const bytesPerVec4f = 16;

const data = [
  [
    new Float32Array(ab, bytesPerRow * 0 + bytesPerVec4f * 0, 4),
    new Float32Array(ab, bytesPerRow * 0 + bytesPerVec4f * 1, 4),
    new Float32Array(ab, bytesPerRow * 0 + bytesPerVec4f * 2, 4),
  ],
  [
    new Float32Array(ab, bytesPerRow * 1 + bytesPerVec4f * 0, 4),
    new Float32Array(ab, bytesPerRow * 1 + bytesPerVec4f * 1, 4),
    new Float32Array(ab, bytesPerRow * 1 + bytesPerVec4f * 2, 4),
  ],
];

Of those, the first, just one Float32Array would arguably be the most common since making lots of views has a cost.

As for uploading, you could either make typed arrays that view the same buffer (like above) and then upload the entire buffer. Or you can upload one smaller typed array at a time

For example

// WebGPU Simple Compute Shader
// from https://webgpufundamentals.org/webgpu/webgpu-simple-compute.html


async function main() {
  const adapter = await navigator.gpu?.requestAdapter();
  const device = await adapter?.requestDevice();
  if (!device) {
    fail('need a browser that supports WebGPU');
    return;
  }

  const module = device.createShaderModule({
    label: 'doubling compute module',
    code: `
      @group(0) @binding(0) var<storage, read_write> data: array<array<f32, 3>>;

      @compute @workgroup_size(1) fn computeSomething(
        @builtin(global_invocation_id) id: vec3<u32>
      ) {
        let col = id.x;
        let row = id.y;
        
        data[row][col] = data[row][col] * 2.0;
      }
    `,
  });

  const pipeline = device.createComputePipeline({
    label: 'doubling compute pipeline',
    layout: 'auto',
    compute: {
      module,
      entryPoint: 'computeSomething',
    },
  });

  const input = [
    new Float32Array([
      11, 12, 13,
    ]),
    new Float32Array([
      31, 32, 33,
    ]),
  ];

  // create a buffer on the GPU to hold our computation
  // input and output
  const workBuffer = device.createBuffer({
    label: 'work buffer',
    size: input[0].byteLength * input.length,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST,
  });
  // Copy our input data to that buffer
  const bytesPerRow = 3 * 4
  device.queue.writeBuffer(workBuffer, bytesPerRow * 0, input[0]);
  device.queue.writeBuffer(workBuffer, bytesPerRow * 1, input[1]);

  // create a buffer on the GPU to get a copy of the results
  const resultBuffer = device.createBuffer({
    label: 'result buffer',
    size: workBuffer.size,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });

  // Setup a bindGroup to tell the shader which
  // buffer to use for the computation
  const bindGroup = device.createBindGroup({
    label: 'bindGroup for work buffer',
    layout: pipeline.getBindGroupLayout(0),
    entries: [
      { binding: 0, resource: { buffer: workBuffer } },
    ],
  });

  // Encode commands to do the computation
  const encoder = device.createCommandEncoder({
    label: 'doubling encoder',
  });
  const pass = encoder.beginComputePass({
    label: 'doubling compute pass',
  });
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(input[0].length, input.length);
  pass.end();

  // Encode a command to copy the results to a mappable buffer.
  encoder.copyBufferToBuffer(workBuffer, 0, resultBuffer, 0, resultBuffer.size);

  // Finish encoding and submit the commands
  const commandBuffer = encoder.finish();
  device.queue.submit([commandBuffer]);

  // Read the results
  await resultBuffer.mapAsync(GPUMapMode.READ);
  const ab = resultBuffer.getMappedRange()
  const result = [
    new Float32Array(ab, bytesPerRow * 0, input[0].length),
    new Float32Array(ab, bytesPerRow * 1, input[0].length),
  ];
  console.log('input', input);
  console.log('result', result);

  resultBuffer.unmap();
}

function fail(msg) {
  // eslint-disable-next-line no-alert
  alert(msg);
}

main();

@import url(https://webgpufundamentals.org/webgpu/resources/webgpu-lesson.css);

In the example above the array is declared as

@group(0) @binding(0) var<storage, read_write> data: array<array<f32, 3>>;

Leaving off the length of the outer array means the size of the outer array is decided by the size of the buffer you bind at runtime.

You might find this article about memory layout and this wsgl offset computer helpful