python pandas scale categories boolean-logic

Pandas Scaling as category not returning expected result when comparing variables


I'm taking a few Coursera courses, and in one of them I have to use the pandas astype function to convert some values in a DataFrame to categories. As part of the exercise I have to compare grades to check that astype did indeed put them in order. The given exercise works, but the one I wrote afterwards doesn't. Here is the code:
Working Code

import pandas as pd
import numpy as np
df = pd.DataFrame(['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'],
                  index=['excellent', 'excellent', 'excellent', 'good', 'good', 'good', 'ok', 'ok', 'ok', 'poor', 'poor'])
df.rename(columns={0: 'Grades'}, inplace=True)
grades = df['Grades'].astype('category',
                         categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'],
                         ordered=True)
grades > 'C'


Which returns:

excellent     True
excellent     True
excellent     True
good          True
good          True
good          True
ok            True
ok           False
ok           False
poor         False
poor         False
Name: Grades, dtype: bool


My code

s = pd.Series(['Low', 'Low', 'High', 'Medium', 'Low', 'High', 'Low'])
s.astype('category', categories=['Low', 'Medium', 'High'], ordered=True)
s>'Low'


Which returns:

0    False
1    False
2    False
3     True
4    False
5    False
6    False
dtype: bool



As you can see, the comparison 'High' > 'Low' returns False. Am I doing something wrong? Am I missing a concept? Thank you.


Solution

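  • You forgot to assign the output. astype returns a new Series and does not modify s in place, so s still holds plain strings and s > 'Low' compares them alphabetically ('High' sorts before 'Low', 'Medium' after it):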

    print (s > 'Low')
    0    False
    1    False
    2    False
    3     True
    4    False
    5    False
    6    False
    dtype: bool
    
    s = s.astype('category', categories=['Low', 'Medium', 'High'], ordered=True)
    
    print (s > 'Low')
    0    False
    1    False
    2     True
    3     True
    4    False
    5     True
    6    False
    dtype: bool
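
For readers on a recent pandas release, here is a minimal sketch of the same fix. As far as I know, newer versions no longer accept the categories= and ordered= keywords on astype, so this assumes the CategoricalDtype route instead. The variable names level_dtype, as_strings and as_categories are only illustrative and do not come from the question.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

s = pd.Series(['Low', 'Low', 'High', 'Medium', 'Low', 'High', 'Low'])

# Plain strings compare alphabetically: 'High' > 'Low' is False because 'H' < 'L'.
as_strings = s > 'Low'

# Describe the ordering once, then assign the converted Series back to s.
level_dtype = CategoricalDtype(categories=['Low', 'Medium', 'High'], ordered=True)
s = s.astype(level_dtype)

# The comparison now follows the category order Low < Medium < High.
as_categories = s > 'Low'
print(as_categories.tolist())
```

The important part is still the assignment s = s.astype(...). Without it the converted Series is discarded and the comparison runs on the original strings.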