I am trying to calculate continuous entropy over a tensor, but I get -inf most of the time because the numbers in the tensor are very small:
tf.Tensor([-inf -inf -inf -inf -inf -inf -inf], shape=(7,), dtype=float32)
And this is a subsample of the tensor I have:
tf_ent = tf.constant([ [0.096, -0.00065, 0.99, 0.01 ],
[0.097, 0.033, 0.025, 0.005 ],
[0.009, 0.0001, 0.0009, 0.0033],
[0.00060, 0.001, 0.03, 0.0005 ],
[0.0049, -0.08, -0.05, -0.00036],
[0.09 , -0.45, 0.087, 0.0023 ],
[0.3, -0.23, 0.82, -0.28 ]])
And this is how I calculate the continuous entropy of each sample of this tensor:
import numpy as np
import tensorflow as tf

tf.enable_eager_execution()

def score(X):
    def entropy(data):
        # treat each row as one sample of a multivariate Gaussian
        if data is not None:
            data = tf.reshape(data, shape=[1, -1])
        num_samples = data.shape[0]
        if len(data.shape) == 1:
            num_dimensions = 1
        else:
            num_dimensions = data.shape[1]
        # determinant of data @ data.T / num_samples
        detCov = tf.linalg.det(tf.cast(tf.matmul(data, tf.transpose(data)), tf.int32) / tf.cast(num_samples, tf.int32))
        # (2 * pi * e) ** num_dimensions normalisation term
        normalization = tf.math.pow(tf.cast(tf.math.multiply(2, tf.math.multiply(np.pi, tf.math.exp(1.0))), tf.int32), num_dimensions)
        if detCov == 0:
            return -np.inf
        else:
            return 0.5 * tf.math.log(tf.math.multiply(tf.cast(normalization, tf.float32), tf.cast(detCov, tf.float32)))
    rev = tf.map_fn(entropy, X, dtype=tf.float32)
    return rev
ent_p = score(tf_ent)
So my question here is: is it OK to multiply all the elements in the tensor by, say, 10000, so that I get a score for most of my rows?
Or does that not make sense conceptually?
I'm sure you realise that you are seeing this behaviour because you are passing a very small number into a log function. The same thing can happen if we try to divide by a very small number: after accounting for the numerical accuracy limits of a float32 (or whichever dtype you use), we end up dividing by exactly zero.
The most common approach to avoiding this issue (used in the majority, maybe even all, of the 'out of the box' loss functions) is to add a very small constant value (commonly called epsilon) when we take a log or divide. The principle is that the epsilon is small enough to be negligible in terms of how much it changes the loss value, but large enough that the argument of the log (or the divisor) never actually reaches zero.
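To see the principle in isolation, here is a minimal sketch (the 1e-6 value is just an illustrative choice, matching the epsilon used below):

import tensorflow as tf

tf.enable_eager_execution()

x = tf.constant(0.0)             # a value that has underflowed to zero
epsilon = tf.constant(1e-6)

print(tf.math.log(x))            # tf.Tensor(-inf, ...)
print(tf.math.log(x + epsilon))  # log(1e-6) is roughly -13.8: large but finite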
So maybe change to something like this:
def score(X):
    def entropy(data):
        epsilon = tf.constant(0.000001)
        if data is not None:
            data = tf.reshape(data, shape=[1, -1])
        num_samples = data.shape[0]
        if len(data.shape) == 1:
            num_dimensions = 1
        else:
            num_dimensions = data.shape[1]
        detCov = tf.linalg.det(tf.cast(tf.matmul(data, tf.transpose(data)), tf.int32) / tf.cast(num_samples, tf.int32))
        normalization = tf.math.pow(tf.cast(tf.math.multiply(2, tf.math.multiply(np.pi, tf.math.exp(1.0))), tf.int32), num_dimensions)
        if detCov == 0:
            return -np.inf
        else:
            # epsilon keeps the argument of the log strictly positive
            return 0.5 * tf.math.log(epsilon + tf.math.multiply(tf.cast(normalization, tf.float32), tf.cast(detCov, tf.float32)))
    rev = tf.map_fn(entropy, X, dtype=tf.float32)
    return rev

ent_p = score(tf_ent)