
Compute entropy of a set of points


I have a set of points that lie on an n-dimensional sphere (so each has unit norm). I need to compute the entropy of this set. Is there a tool that lets me do this? If not, how can I do it myself?

My guess is that, since entropy has to be computed from a probability distribution, I need to split the problem into two parts:

1) Some function which takes the set of points and outputs a probability distribution which approximates this set.

2) A function which takes the probability distribution (or its density) and gives me its entropy.

All of this while knowing that the points lie on the n-sphere. Thanks!


Solution

  • I am not sure exactly how much data you have, but based on your description I suggest using kernel density estimation (KDE) to estimate the pdf. KDE is available in both scipy (scipy.stats.gaussian_kde) and scikit-learn (sklearn.neighbors.KernelDensity).

    scikit-learn gives you more choices of kernel: gaussian, tophat, epanechnikov, exponential, linear and cosine, whereas scipy's gaussian_kde supports only a Gaussian kernel.

    import numpy as np
    from sklearn.neighbors import KernelDensity
    
    ### create data ###
    sample_count = 1000
    n = 6
    data = np.random.randn(sample_count, n)
    data_norm = np.sqrt(np.sum(data*data, axis=1))
    data = data/data_norm[:, None]   # Normalized data to be on unit sphere
    
    
    ## estimate pdf using KDE with gaussian kernel
    kde = KernelDensity(kernel='gaussian', bandwidth=0.2).fit(data)
    
    log_p = kde.score_samples(data)  # log-density log(p) at each data sample
    p = np.exp(log_p)                # density p at each data sample
    entropy = -np.sum(p*log_p)       # sum of -p*log(p) over the samples
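
The sum of -p*log(p) above treats the density values at the samples as if they were discrete probabilities, although they do not sum to one. When the points are themselves draws from the distribution, a commonly used alternative is the resubstitution estimate of differential entropy, H ≈ -(1/N) Σ log p(x_i). The following is a minimal sketch of that estimate, not a definitive implementation. The helper name `kde_entropy`, the fixed random seed and the default bandwidth of 0.2 are assumptions made for illustration.

```python
import numpy as np
from sklearn.neighbors import KernelDensity


def kde_entropy(points, bandwidth=0.2):
    """Resubstitution estimate of differential entropy, in nats.

    `kde_entropy` is a hypothetical helper; the bandwidth default is an
    arbitrary illustrative choice and should be tuned for real data.
    """
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(points)
    log_p = kde.score_samples(points)  # log-density at each sample
    return -np.mean(log_p)             # H ~ -(1/N) * sum_i log p(x_i)


# Same kind of synthetic data as above: Gaussian draws projected onto the unit sphere.
rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 6))
data /= np.linalg.norm(data, axis=1, keepdims=True)

entropy = kde_entropy(data)
print(entropy)
```

Two caveats apply. The result depends strongly on the bandwidth, and evaluating the KDE at the same points it was fitted on biases the estimate, because each point contributes its own kernel. Also, a Gaussian KDE fitted this way estimates a density in the ambient space R^n rather than on the surface of the sphere, so the number is best used for comparing point sets processed in the same way, not as an absolute entropy on the sphere.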