
Entropy and Probability in Theano


I have written some simple Python code to calculate the entropy of a set, and I am trying to write the same thing in Theano.

import math

# this computes the probability of each distinct value in the set
def prob(values):
    return [float(values.count(v))/len(values) for v in set(values)]

# this computes the entropy
def entropy(values):
    p = prob(values)
    return -sum([v*math.log(v) for v in p])
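For reference, here is a self-contained run of the pure-Python version (the helpers are repeated so the snippet stands alone; `prob` iterates over the distinct values so duplicates are not double-counted, and `math.log` gives entropy in nats):

```python
import math

# probability of each distinct value in the set
def prob(values):
    return [float(values.count(v)) / len(values) for v in set(values)]

# Shannon entropy in nats (math.log is the natural logarithm)
def entropy(values):
    p = prob(values)
    return -sum(v * math.log(v) for v in p)

# distribution of [0, 1, 1] is {1/3, 2/3}:
# -(1/3 * ln(1/3) + 2/3 * ln(2/3)) ~= 0.6365 nats
print(entropy([0, 1, 1]))
```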

I am trying to write the equivalent code in Theano, but I am not sure how to do it:

import theano
import theano.tensor as T

v = T.vector('v') # I create a symbolic vector to represent my initial values
p = T.vector('p') # The same for the probabilities 

# this is my attempt to compute the probabilities which would feed vector p
theano.scan(fn=prob,outputs_info=p,non_sequences=v,n_steps=len(values))

# considering the previous step would work, the entropy is just
e = -T.sum(p*T.log(p))
entropy = theano.function([values],e)

However, the scan line is not correct and I get tons of errors. Is there a simple way to compute the entropy of a vector, or do I need to dig deeper into the scan function? Any ideas?


Solution

  • Other than the point raised by nouiz, p should not be declared as a T.vector, because it will be the result of a computation on your vector of values.

    Also, to compute something like entropy, you do not need Scan (Scan introduces computation overhead, so it should only be used when there is no other way to express the computation, or to reduce memory usage); you can take an approach like this:

    values = T.vector('values')
    nb_values = values.shape[0]
    
    # For every element in 'values', obtain the total number of times
    # its value occurs in 'values'.
    # NOTE : I've done the broadcasting a bit more explicitly than
    # needed, for clarity.
    freqs = T.eq(values[:,None], values[None, :]).sum(0).astype("float32")
    
    # Compute a vector containing, for every value in 'values', the
    # probability of that value in the vector 'values'.
    # NOTE : these probabilities do *not* sum to 1, because 'probs' holds,
    # for every position in 'values', the probability of the value at that
    # position (so duplicated values get duplicated entries). For instance,
    # if 'values' is [1, 1, 0] then 'probs' will be [2/3, 2/3, 1/3], because
    # the value 1 has probability 2/3 and the value 0 has probability 1/3
    # in 'values'.
    probs = freqs / nb_values
    
    entropy = -T.sum(T.log2(probs) / nb_values)
    fct = theano.function([values], entropy)
    
    # Will output 0.918296...
    print(fct([0, 1, 1]))