Search code examples
pythonarraysnumpyprobability-density

Python - calculating pdf from a numpy array distribution


Given an array of values, I want to be able to fit a density function to it and find the pdf of an arbitrary input value. Is this possible, and how would I go about it? There aren't necessarily assumptions of normality, and I don't need the function itself.

For instance, given:

x = array([ 0.62529759, -0.08202699,  0.59220673, -0.09074541,  0.05517865,
        0.20153703,  0.22773723, -0.26229708,  0.76137555, -0.61229314,
        0.27292745,  0.35596795, -0.01373896,  0.32464979, -0.22932331,
        1.14796175,  0.17268531,  0.40692172,  0.13846154,  0.22752953,
        0.13087359,  0.14111479, -0.09932381,  0.12800392,  0.02605917,
        0.18776078,  0.45872642, -0.3943505 , -0.0771418 , -0.38822433,
       -0.09171721,  0.23083624, -0.21603973,  0.05425592,  0.47910286,
        0.26359565, -0.19917942,  0.40182097, -0.0797546 ,  0.47239264,
       -0.36654449,  0.4513859 , -0.00282486, -0.13950512, -0.05375369,
        0.03331833,  0.48951555, -0.13760504,  2.788     , -0.15017848,
        0.02930675,  0.10910646,  0.03868301, -0.048482  ,  0.7277376 ,
        0.08841259, -0.10968462,  0.50371324,  0.86379698,  0.01674877,
        0.19542421, -0.06639165,  0.74500856, -0.10148342,  0.02482331,
        0.79195804,  0.40401969,  0.25120005,  0.21020794, -0.01767013,
       -0.13453783, -0.09605592, -0.88044229,  0.04689623,  0.09043851,
        0.21232286,  0.34129982, -0.3736799 ,  0.17313858])

I would like to find how a value of 0.3 compares to all of the above, and what percent of the above values it is greater than.


Solution

  • You can use openTURNS for that. You can use a Gaussian kernel smoothing to do that easily! From the doc:

    import openturns as ot 
    kernel = ot.KernelSmoothing()
    estimated = kernel.build(x)
    

    That's it, now you have a distribution object :)

    This library is very cool for statistics! (I am not related to them).