I have a dataframe of precounted data (shown below). Let's assume it's a "Do you like?" scale, where 4 people answered 1-Don't like at all, 10 people answered 2-Don't like, and so on.
How can I compute summary statistics from these counts? I want the mean (in this case it can be done by hand: (4*1+10*2+125*3+85*4+25*5)/(4+10+125+85+25)=3.47) and the standard deviation.
df = pd.DataFrame({1: 4, 2: 10, 3: 125, 4: 85, 5: 25}, index=[0])  # all-scalar values need an explicit index
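You can expand the counts into one row per respondent, repeating each rating as many times as it was counted. Then you can use pandas.DataFrame.describe(), which reports several summary statistics at once, including the count, mean, and standard deviation. Note that the std it reports is the sample standard deviation (ddof=1).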
import pandas as pd
# Given dictionary
data = {1: 4, 2: 10, 3: 125, 4: 85, 5: 25}
# Create a list of ratings based on the counts
ratings = []
for rating, count in data.items():
    ratings.extend([rating] * count)
# Create a DataFrame
df = pd.DataFrame(ratings, columns=['Rating'])
df.describe()
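If expanding the counts into individual rows feels wasteful, the same statistics can be computed directly from the counts as a weighted mean and a weighted standard deviation. The following is a minimal sketch, not part of the approach above: the names `counts`, `values`, `weights`, `std_population`, and `std_sample` are illustrative, and it assumes the counts are kept in a pandas Series indexed by rating.

```python
import numpy as np
import pandas as pd

# Counts per rating, as in the question (Series layout is an assumption)
counts = pd.Series({1: 4, 2: 10, 3: 125, 4: 85, 5: 25})

values = counts.index.to_numpy(dtype=float)   # the ratings 1..5
weights = counts.to_numpy(dtype=float)        # how many people gave each rating
n = weights.sum()                             # total number of respondents

# Weighted mean: same arithmetic as the hand calculation in the question
mean = np.average(values, weights=weights)

# Sum of squared deviations from the mean, weighted by the counts
ss = np.sum(weights * (values - mean) ** 2)

std_population = np.sqrt(ss / n)        # divides by n (ddof=0)
std_sample = np.sqrt(ss / (n - 1))      # divides by n - 1 (ddof=1), as describe() does

print(f"mean={mean:.2f}, std_sample={std_sample:.4f}, std_population={std_population:.4f}")
```

The mean comes out at about 3.47, matching the hand calculation. Which standard deviation to report depends on whether the 249 answers are treated as a sample or as the whole population.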