Search code examples
rstatisticsentropy

Entropy calculation gives NaN - is applying na.omit a valid tweak?


By definition, the entropy is defined as:

entropy <- function (p) sum(-p * log(p))

I'm performing LCA using the poLCA package and trying to calculate entropy, which for some of my models are outputting NaN.

error_prior <- entropy(lca2$P) # Class proportions model 2
error_post <- mean(apply(lca2$posterior, 1, entropy), na.rm = TRUE)
results[2,8] <- round(((error_prior - error_post) / error_prior), 3)

From the answer to this question: Entropy output is NaN for some class solutions and not others, I learnt that it is caused by zeros in p and it can be resolved by adding na.omit to the function as follows:

entropy <- function (p) sum(na.omit(-p * log(p)))

My question is - is this technical tweak mathematically valid without affecting the integrity of the calculation?

In my case, around 1/3 of the values in p are zeros. I'm really unsure if I should use na.omit or find another way to resolve this problem.


Solution

  • It is valid, but not transparent at first glance. The reason is that the mathematical limit of xlog(x) as x -> 0 is 0 (we can prove this using L'Hospital Rule). In this regard, the most robust definition of the function should be

    entropy.safe <- function (p) {
      if (any(p > 1 | p < 0)) stop("probability must be between 0 and 1")
      log.p <- numeric(length(p))
      safe <- p != 0
      log.p[safe] <- log(p[safe])
      sum(-p * log.p)
    }
    

    But simply dropping p = 0 cases gives identical results, because the result at p = 0 is 0 and contributes nothing to the sum anyway.

    entropy.brutal <- function (p) {
      if (any(p > 1 | p < 0)) stop("probability must be between 0 and 1")
      log.p <- log(p)
      ## as same as sum(na.omit(-p * log.p))
      sum(-p * log.p, na.rm = TRUE)
    }
    
    ## p has a single 0
    ( p <- seq(0, 1, by = 0.1) )
    #[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
    entropy.brutal(p)
    #[1] 2.455935
    entropy.safe(p)
    #[1] 2.455935
    
    ## half of p are zeros
    p[1:5] <- 0
    p
    #[1] 0.0 0.0 0.0 0.0 0.0 0.5 0.6 0.7 0.8 0.9 1.0
    entropy.brutal(p)
    #[1] 1.176081
    entropy.safe(p)
    #[1] 1.176081
    

    In conclusion, we can use either entropy.brutal or entropy.safe.