I have the following table:
perc
0 59.98797
1 61.89383
2 61.08403
3 61.00661
4 62.64753
5 62.18118
6 60.74520
7 57.83964
8 62.09705
9 57.07985
10 58.62777
11 60.02589
12 58.74948
13 59.14136
14 58.37719
15 58.27401
16 59.67806
17 58.62855
18 58.45272
19 57.62186
20 58.64749
21 58.88152
22 54.80138
23 59.57697
24 60.26713
25 60.96022
26 55.59813
27 60.32104
28 57.95403
29 58.90658
30 53.72838
31 57.03986
32 58.14056
33 53.62257
34 57.08174
35 57.26881
36 48.80800
37 56.90632
38 59.08444
39 57.36432
consisting of various percentages.
I'm interested in creating a probability distribution based on these percentages for the sake of coming up with a confidence interval (say 95%) of what we would expect this percentage to be.
With percDone['perc'].plot.density()
, I can get a nice density plot, but I don't know how to extract CIs from this. I also considered making a Weighted KDE, but again, I don't know how to get CIs. How should I go about finding my confidence interval?
Thanks!
Try this using scipy package & numpy;
import scipy.stats as st
import numpy as np
# Get data in a list
lst = list(percDone['perc'])
# create 95% confidence interval
st.t.interval(alpha=0.95, df=len(lst)-1,
loc=np.mean(lst),
scale=st.sem(lst))
For Kernel Density; Go through the documentation https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KernelDensity.html#sklearn.neighbors.KernelDensity
Hope this Helps...