My current code using sparse_softmax_cross_entropy works fine.
loss_normal = (
    tf.reduce_mean(tf.losses
                   .sparse_softmax_cross_entropy(labels=labels,
                                                 logits=logits,
                                                 weights=class_weights))
)
However, when I try to use the hinge_loss:
loss_normal = (
    tf.reduce_mean(tf.losses
                   .hinge_loss(labels=labels,
                               logits=logits,
                               weights=class_weights))
)
It reported an error saying:
ValueError: Shapes (1024, 2) and (1024,) are incompatible
The error seems to originate from this function in the losses_impl.py file:
with ops.name_scope(scope, "hinge_loss", (logits, labels)) as scope:
    ...
    logits.get_shape().assert_is_compatible_with(labels.get_shape())
    ...
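I can reproduce that check in isolation with dummy tensors of the same shapes (a minimal sketch, not my actual model code):

import tensorflow as tf

logits = tf.zeros([1024, 2])   # dummy stand-in for my logits
labels = tf.zeros([1024])      # dummy stand-in for my labels
# Rank 2 vs. rank 1, so this raises:
# ValueError: Shapes (1024, 2) and (1024,) are incompatible
logits.get_shape().assert_is_compatible_with(labels.get_shape())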
I modified my code as below to extract just one column of the logits tensor:
loss_normal = (
    tf.reduce_mean(tf.losses
                   .hinge_loss(labels=labels,
                               logits=logits[:,1:],
                               weights=class_weights))
)
But it still reports a similar error:
ValueError: Shapes (1024, 1) and (1024,) are incompatible.
Can someone please help point out why my code works fine with the sparse_softmax_cross_entropy loss but not with hinge_loss?
The tensor labels has shape [1024], and the tensor logits has shape [1024, 2]. This works fine for tf.nn.sparse_softmax_cross_entropy_with_logits:
- labels: Tensor of shape [d_0, d_1, ..., d_{r-1}] (where r is rank of labels and result) and dtype int32 or int64. Each entry in labels must be an index in [0, num_classes). Other values will raise an exception when this op is run on CPU, and return NaN for corresponding loss and gradient rows on GPU.
- logits: Unscaled log probabilities of shape [d_0, d_1, ..., d_{r-1}, num_classes] and dtype float32 or float64.
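For example, a toy batch of 3 examples and 2 classes (made-up values) satisfies these requirements:

import tensorflow as tf

labels = tf.constant([0, 1, 1], dtype=tf.int64)  # shape [3]: class indices
logits = tf.constant([[2.0, -1.0],
                      [0.5, 1.5],
                      [-0.3, 0.8]])              # shape [3, 2]
# Works: labels have one fewer dimension than logits,
# and the result is a per-example loss of shape [3].
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels,
                                                      logits=logits)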
But the tf.losses.hinge_loss requirements are different:
- labels: The ground truth output tensor. Its shape should match the shape of logits. The values of the tensor are expected to be 0.0 or 1.0.
- logits: The logits, a float tensor.
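So for hinge_loss the two tensors have to line up element for element; a matching toy sketch (made-up values again):

import tensorflow as tf

labels = tf.constant([[0.0], [1.0], [1.0]])    # shape [3, 1], values 0.0/1.0
logits = tf.constant([[2.0], [-0.5], [0.8]])   # shape [3, 1], same as labels
# Works: the shapes match exactly.
loss = tf.losses.hinge_loss(labels=labels, logits=logits)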
You can resolve this in two ways:

The first option is to reshape the labels to [1024, 1] and use just one column of logits, like you did with logits[:,1:]:
labels = tf.reshape(labels, [-1, 1])
hinge_loss = (
    tf.reduce_mean(tf.losses.hinge_loss(labels=labels,
                                        logits=logits[:,1:],
                                        weights=class_weights))
)
I think you'll also need to reshape the class_weights the same way.
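For example, assuming class_weights is a flat [1024] tensor like labels (an assumption; the question doesn't show how it's built):

class_weights = tf.reshape(class_weights, [-1, 1])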
The second option is to use all of the learned logits via tf.reduce_sum, which will produce a flat (1024,) tensor:
logits = tf.reduce_sum(logits, axis=1)
hinge_loss = (
    tf.reduce_mean(tf.losses.hinge_loss(labels=labels,
                                        logits=logits,
                                        weights=class_weights))
)
This way you don't need to reshape labels or class_weights.
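As a quick sanity check of this second approach with toy stand-ins (hypothetical values, batch of 3 instead of 1024):

import tensorflow as tf

labels = tf.constant([0.0, 1.0, 1.0])           # shape [3], values 0.0/1.0
class_weights = tf.constant([1.0, 2.0, 2.0])    # shape [3]
logits = tf.constant([[2.0, -1.0],
                      [0.5, 1.5],
                      [-0.3, 0.8]])             # shape [3, 2]

logits = tf.reduce_sum(logits, axis=1)          # shape [3], matches labels
hinge_loss = tf.reduce_mean(
    tf.losses.hinge_loss(labels=labels,
                         logits=logits,
                         weights=class_weights))  # scalar, no reshapes needed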