python pandas data-analysis

Python pandas hypothesis: average rating for the "expensive" books. Need some help understanding the basic features of pandas


I'm studying pandas now and having trouble understanding its basic features. I'm exploring this data set. There's a variable "Price (Above Average)" that contains "Yes" if the price of the book is greater than the average, and "No" if it is less.

I assumed that a book's rating is independent of its price and want to test that. For this I need to graph the average user rating for each of the two groups.

First I want to print the average rating for the "expensive" books just to figure out how it works. I don't understand the syntax very well yet, so I'm hoping for your help.


Solution

  • To add a column holding the average rating of each book's price group:

    df['average_rating_for_books'] = df.groupby(['Price (Above Average)'])['User Rating (Round)'].transform('mean')
    

    After this, you can select only the books which are expensive.

    To select those rows, you can use a boolean mask like:

    df[df['Price (Above Average)'] == 'Yes']
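If you only want the numbers themselves rather than a new column, a plain `mean()` on the grouped column may be closer to what the question asks. Below is a minimal sketch of that idea. The column names come from the question, but the sample rows are made up for illustration and the variable names `group_means` and `expensive_mean` are my own; with the real data set you would load `df` from your file instead.

```python
import pandas as pd

# Hypothetical sample rows, only for illustration; the real data set
# has more rows and columns.
df = pd.DataFrame({
    "Price (Above Average)": ["Yes", "No", "Yes", "No", "No"],
    "User Rating (Round)": [4.0, 5.0, 5.0, 4.0, 3.0],
})

# One mean per group: the result is a Series indexed by "No" and "Yes".
group_means = df.groupby("Price (Above Average)")["User Rating (Round)"].mean()
print(group_means)

# Mean rating for the "expensive" books only: select the rows with a
# boolean mask, pick the rating column, then average it.
is_expensive = df["Price (Above Average)"] == "Yes"
expensive_mean = df.loc[is_expensive, "User Rating (Round)"].mean()
print(expensive_mean)
```

The difference from `transform('mean')` is that `transform` repeats the group mean on every row of the original frame, while `mean()` after `groupby` returns one value per group. For the graph, something like `group_means.plot.bar()` should work as a starting point, assuming matplotlib is installed.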