I have a variable list of elements. I want to be able to find count, mean, std, min, 25%, 50%, 75%, and max.
I know I can use pandas.Series.describe(). However, I have a restriction that I cannot use pandas for this specific problem. Is there any built-in function or package that will give me the same output? Thanks.
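As mentioned in the comments, min and max are built in, and the count is just the length of the list, so you can simply call len(your_list), max(your_list), min(your_list). (There is no built-in count() function; list.count(x) counts occurrences of x instead.)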
I would recommend using libraries such as pandas or NumPy if you can. If you are restricted to the standard library, you can also take a look at the statistics module.
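As a rough sketch of the standard-library route, the statistics module can cover most of what describe() reports. The function name describe_stdlib and the dictionary layout are my own choices for illustration, not part of any library. This assumes Python 3.8+ (for statistics.quantiles) and a list with at least two numeric elements.

```python
import statistics

def describe_stdlib(values):
    # method="inclusive" interpolates linearly between the sorted data points,
    # which should line up with pandas' default quantile behaviour.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": max(values),
    }

summary = describe_stdlib([1, 2, 3, 4, 5])
for name, value in summary.items():
    print(f"{name:<6}{value}")
```

Returning a dictionary rather than a formatted string keeps the values usable for further computation; formatting can be done separately.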
For the others:
Mean
def mean(li):
    return sum(li) / len(li)
Standard Deviation
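def std(li):
    mu = mean(li)
    # Divide by n - 1 (sample standard deviation) to match pandas' describe();
    # this needs at least two elements.
    return (sum((x - mu) ** 2 for x in li) / (len(li) - 1)) ** 0.5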
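One detail worth checking when comparing against pandas: describe() reports the sample standard deviation (divisor n - 1), while the population form divides by n. A small illustration using the statistics module, with made-up data chosen so the numbers come out cleanly:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only

population_std = statistics.pstdev(data)  # divides by n
sample_std = statistics.stdev(data)       # divides by n - 1

print(population_std)
print(sample_std)
```

For short lists the two can differ noticeably, so pick the one that matches the output you are trying to reproduce.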
Quartiles
This works for any percentile, so you can call percentile(your_list, 25) or any other value as needed.
import math

def percentile(li, p):
    n = len(li)
    # Clamp the index so that p = 100 returns the maximum instead of raising IndexError
    idx = min(math.floor(n * p / 100), n - 1)
    return sorted(li)[idx]
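Note that an index lookup always returns an actual element of the list, whereas pandas by default interpolates linearly between neighbouring elements, so the quartiles can differ. If you need closer agreement, a sketch along these lines might help. The name percentile_linear is hypothetical, and I am assuming pandas' default "linear" interpolation is the behaviour you want to mimic.

```python
import math

def percentile_linear(values, p):
    """Percentile with linear interpolation between the two nearest ranks."""
    s = sorted(values)
    if not s:
        raise ValueError("percentile_linear() requires at least one value")
    # Fractional position in the sorted list: 0 for p = 0, len - 1 for p = 100.
    pos = (len(s) - 1) * p / 100
    lo = math.floor(pos)
    hi = math.ceil(pos)
    # Blend the two neighbours according to the fractional part of pos.
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

print(percentile_linear([1, 2, 3, 4], 25))
print(percentile_linear([1, 2, 3, 4], 50))
```

With four elements the 25th percentile falls between the first and second values, which is where the two approaches diverge.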
If you want to replicate the layout of pandas' describe() output:
def describe(li):
    return f"""
count {len(li)}
mean {mean(li)}
std {std(li)}
min {min(li)}
25% {percentile(li, 25)}
50% {percentile(li, 50)}
75% {percentile(li, 75)}
max {max(li)}"""