Convert TensorFlow graph to CoreML

I'm trying to convert a TensorFlow graph to CoreML, and I'm following this tutorial. It contains this bit of code that I don't quite understand:

#include <metal_stdlib>
using namespace metal;

kernel void swish(
  texture2d_array<half, access::read> inTexture [[texture(0)]],
  texture2d_array<half, access::write> outTexture [[texture(1)]],
  ushort3 gid [[thread_position_in_grid]])
{
  // Skip threads that fall outside the texture bounds.
  if (gid.x >= outTexture.get_width() ||
      gid.y >= outTexture.get_height()) {
    return;
  }

  // Read 4 channels from slice gid.z, apply swish, write them back.
  const float4 x = float4(inTexture.read(gid.xy, gid.z));
  const float4 y = x / (1.0f + exp(-x));
  outTexture.write(half4(y), gid.xy, gid.z);
}

What I don't understand is the use of gid here. Isn't the grid two-dimensional? What does gid.z signify? Isn't gid.x the x-coordinate of the current pixel?


  • gid.x and gid.y are the x/y coordinates of the current pixel. So when you read from inTexture at gid.xy, you get 4 channels' worth of pixel data.

    But the "images" used in neural networks may have many more than 4 channels. That's why the data type for the textures is texture2d_array<> instead of just texture2d<>.

    The gid.z value refers to the index of the texture "slice" in this array. If the image/tensor has 32 channels, then there are 8 texture slices (because each slice stores up to 4 channels of data).

    So the grid really is three dimensional: (x, y, slice).
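
The channel-to-slice arithmetic above can be sketched on the CPU in plain C++. This is only an illustration, not Metal code and not part of the tutorial: the names sliceCount, locate and swishCPU are hypothetical, and the flat [slice][y][x][4] buffer layout is an assumption chosen to mimic how a texture2d_array packs 4 channels per slice.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Number of 4-channel slices needed to hold `channels` feature channels
// (hypothetical helper; rounds up, so 5 channels still need 2 slices).
inline int sliceCount(int channels) { return (channels + 3) / 4; }

// Where one channel lives: the slice index (what gid.z selects) and the
// component inside that slice's 4-wide pixel (0..3).
struct ChannelLocation { int slice; int component; };
inline ChannelLocation locate(int channel) { return {channel / 4, channel % 4}; }

// CPU stand-in for the kernel: one loop iteration per (x, y, slice) grid
// position. Assumes data is laid out as [slice][y][x][4].
inline void swishCPU(std::vector<float>& data, int width, int height, int channels) {
  const int slices = sliceCount(channels);
  for (int z = 0; z < slices; ++z) {          // plays the role of gid.z
    for (int y = 0; y < height; ++y) {        // plays the role of gid.y
      for (int x = 0; x < width; ++x) {       // plays the role of gid.x
        const std::size_t base =
            ((static_cast<std::size_t>(z) * height + y) * width + x) * 4;
        for (int c = 0; c < 4; ++c) {         // the 4 channels of one pixel
          const float v = data[base + c];
          data[base + c] = v / (1.0f + std::exp(-v));
        }
      }
    }
  }
}
```

With these helpers, sliceCount(32) gives 8, matching the example above, and locate(9) places channel 9 in slice 2, component 1. On the GPU the three loops disappear: each thread handles one (x, y, slice) position, and the innermost loop becomes a single float4 operation.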