
Is there a way for the GPU to change one variable simultaneously across multiple threads?


Look at this code, main.swift:

import Metal

let device = MTLCreateSystemDefaultDevice()!
let lib = device.makeDefaultLibrary()!
let function = lib.makeFunction(name: "testout")!
let pso = try device.makeComputePipelineState(function: function)

let commandQueue = device.makeCommandQueue()!
// create a shared buffer that holds a single Float
let buffer = device.makeBuffer(length: MemoryLayout<Float>.stride, options: .storageModeShared)!
// initialize the value that "a" points to in testout.metal to -5
buffer.contents().assumingMemoryBound(to: Float.self).pointee = -5

let commandBuffer = commandQueue.makeCommandBuffer()!
let computeEncoder = commandBuffer.makeComputeCommandEncoder()!
computeEncoder.setComputePipelineState(pso)
computeEncoder.setBuffer(buffer, offset: 0, index: 0)

// dispatch 3 threads, so the kernel runs 3 times
let gridSize = MTLSizeMake(3, 1, 1)

computeEncoder.dispatchThreads(gridSize, threadsPerThreadgroup: gridSize)
computeEncoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

print(buffer.contents().assumingMemoryBound(to: Float.self).pointee)

testout.metal:

#include <metal_stdlib>
using namespace metal;

kernel void testout (device float* a) {
    a[0] += 2;
}

Is there a way to make this program output 1? Is it even possible? (The kernel runs 3 times, the original value is -5, and each run adds 2, so -5 + (3 * 2) = 1.)

Any answer would be appreciated :)


Solution

  • Thanks to Ken Thomases, all I had to do was change testout.metal to this:

    kernel void testout (device atomic_int &a) {
        atomic_fetch_add_explicit(&a, 2, memory_order_relaxed);
    }
    

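    In main.swift, I also had to change the data type of the buffer from Float to Int32. The original a[0] += 2 is a non-atomic read-modify-write, so threads running at the same time can overwrite each other's updates; the atomic add applies every increment in full.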
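
For reference, here is a minimal sketch of how the host side might look after that change. It assumes the atomic_int version of testout.metal is compiled into the default library; the names value, encoder, and result are only illustrative, and the rest mirrors the code from the question.

```swift
import Metal

// Assumption: the default library contains the atomic_int version of "testout".
let device = MTLCreateSystemDefaultDevice()!
let library = device.makeDefaultLibrary()!
let function = library.makeFunction(name: "testout")!
let pso = try device.makeComputePipelineState(function: function)
let commandQueue = device.makeCommandQueue()!

// A single Int32 matches the 32-bit atomic_int on the GPU side.
let buffer = device.makeBuffer(length: MemoryLayout<Int32>.stride, options: .storageModeShared)!
let value = buffer.contents().bindMemory(to: Int32.self, capacity: 1)
value.pointee = -5

let commandBuffer = commandQueue.makeCommandBuffer()!
let encoder = commandBuffer.makeComputeCommandEncoder()!
encoder.setComputePipelineState(pso)
encoder.setBuffer(buffer, offset: 0, index: 0)

// Three threads in one threadgroup; each one atomically adds 2.
let gridSize = MTLSizeMake(3, 1, 1)
encoder.dispatchThreads(gridSize, threadsPerThreadgroup: gridSize)
encoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

// -5 + (3 * 2) = 1
let result = value.pointee
print(result)
```

Because the atomic adds cannot lose updates, the order in which the three threads run does not matter for the final value.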