I would like to design this loss function:
sum((y[argmax(y_)] - y_[argmax(y_)])²)
I can't find a way to compute y[argmax(y_)]. I tried y[k], y[:, k], and y[None, k], but none of these work. This is my code:
Na = 3
x = tf.placeholder(tf.float32, [None, 2])
W = tf.Variable(tf.zeros([2, Na]))
b = tf.Variable(tf.zeros([Na]))
y = tf.nn.relu(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 3])
k = tf.argmax(y_, 1)
diff = y[k] - y_[k]
loss = tf.reduce_sum(tf.square(diff))
And the error:
File "/home/ncarrara/phd/code/cython/robotnavigation/ftq/cftq19.py", line 156, in <module>
diff = y[k] - y_[k]
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 499, in _SliceHelper
name=name)
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 663, in strided_slice
shrink_axis_mask=shrink_axis_mask)
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3515, in strided_slice
shrink_axis_mask=shrink_axis_mask, name=name)
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2508, in create_op
set_shapes_for_outputs(ret)
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1873, in set_shapes_for_outputs
shapes = shape_func(op)
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1823, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "/home/ncarrara/miniconda3/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Shape must be rank 1 but is rank 2 for 'strided_slice' (op: 'StridedSlice') with input shapes: [?,3], [1,?], [1,?], [1].
That can be done using tf.gather_nd:
import tensorflow as tf
Na = 3
x = tf.placeholder(tf.float32, [None, 2])
W = tf.Variable(tf.zeros([2, Na]))
b = tf.Variable(tf.zeros([Na]))
y = tf.nn.relu(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 3])
k = tf.argmax(y_, 1)
# Make index tensor with row and column indices
num_examples = tf.cast(tf.shape(x)[0], dtype=k.dtype)  # cast to match argmax's int64 output
idx = tf.stack([tf.range(num_examples), k], axis=-1)
diff = tf.gather_nd(y, idx) - tf.gather_nd(y_, idx)
loss = tf.reduce_sum(tf.square(diff))
Explanation:
The idea behind tf.gather_nd is to build an index matrix (a two-dimensional tensor) where each row holds the row and column index of one element to extract; tf.gather_nd then collects those elements into the output. For example, if I have a matrix a containing:
| 1 2 3 |
| 4 5 6 |
| 7 8 9 |
And an index matrix i containing:
| 1 2 |
| 0 1 |
| 2 2 |
| 1 0 |
Then the result of tf.gather_nd(a, i) would be the vector (one-dimensional tensor) containing:
| 6 |
| 2 |
| 9 |
| 4 |
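If you want to check this lookup without building a TensorFlow graph, the same pairwise row/column indexing can be reproduced in NumPy (a quick sketch using the example matrices above; NumPy fancy indexing is standing in for tf.gather_nd here):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
i = np.array([[1, 2],
              [0, 1],
              [2, 2],
              [1, 0]])

# Each row of i is a (row, column) pair; indexing with the two
# columns of i separately mimics tf.gather_nd(a, i)
result = a[i[:, 0], i[:, 1]]
print(result)  # [6 2 9 4]
```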
In this case, the column indices are given by tf.argmax in k: for every row, it tells you which column holds the highest value. Now you just need to pair each of these with its row index. The first element in k is the index of the max-value column in row 0, the next element the one for row 1, and so on. num_examples is just the number of rows in x, and tf.range(num_examples) gives you a vector from 0 to the number of rows in x minus 1 (that is, the full sequence of row indices). Now you just need to put that together with k, which is what tf.stack does, and the result idx is the argument for tf.gather_nd.
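To see the whole pipeline end to end without running a session, here is a NumPy sketch of the same computation; the y and y_ values below are made up purely for illustration, and NumPy's np.argmax/np.stack/fancy indexing stand in for their TensorFlow counterparts:

```python
import numpy as np

# Hypothetical predictions and one-hot targets (illustration only)
y = np.array([[0.1, 0.9, 0.0],
              [0.5, 0.2, 0.3]])
y_ = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0]])

k = np.argmax(y_, axis=1)            # max column per row: [1, 0]
rows = np.arange(y.shape[0])         # row indices: [0, 1]
idx = np.stack([rows, k], axis=-1)   # [[0, 1], [1, 0]]

# Gather the selected entries from both tensors and square the difference
diff = y[idx[:, 0], idx[:, 1]] - y_[idx[:, 0], idx[:, 1]]
loss = np.sum(diff ** 2)             # (0.9-1)² + (0.5-1)² ≈ 0.26
```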