python pandas dataframe pandas-loc

Access a row from a df based on a column value


I am trying to find the rows where reliability is < 0.70, but the output seems to include rows where reliability is 0.70 as well. What could be wrong?


Original DF:


   po_id    po_name                 product             year  measure                rate        denominator  numerator  is_reported  reliability
0  1051408  Aberdeen Care Alliance  Commercial HMO/POS  18    CHLAMSCR               67.740000   62.0         42.0       True         NaN
1  1051408  Aberdeen Care Alliance  Commercial HMO/POS  19    AMROV64                80.000000   20.0         16.0       True         NaN
2  1051408  Aberdeen Care Alliance  Commercial HMO/POS  19    CISCOMBO10             17.650000   34.0         6.0        True         NaN
3  1051408  Aberdeen Care Alliance  Commercial HMO/POS  19    OFCSTAFF               69.440000   NaN          NaN        True         0.76
4  1051408  Aberdeen Care Alliance  Commercial HMO/POS  18    BCS5274                86.420000   302.0        261.0      True         NaN
5  1051408  Aberdeen Care Alliance  Commercial HMO/POS  19    SPD1                   57.810000   64.0         37.0       True         NaN
6  1051408  Aberdeen Care Alliance  Commercial HMO/POS  19    PDCS                   79.530000   127.0        101.0      True         NaN
7  1051408  Aberdeen Care Alliance  Commercial HMO/POS  19    TCOC_250K_GEO_RISKADJ  289.281096  NaN          NaN        False        NaN
8  1051408  Aberdeen Care Alliance  Commercial HMO/POS  19    CBPD4                  67.440000   129.0        87.0       True         NaN
9  1051408  Aberdeen Care Alliance  Commercial HMO/POS  18    COORDINATE3            55.370000   NaN          NaN        True         0.74


Code added to locate rows where reliability is less than 0.70:

    awards_test_df.loc[awards_test_df['reliability'] < 0.70]


Output:

po_id   po_name product year    measure rate    denominator numerator is_reported   reliability

191 1008200 Advancements Physicians Medical Center  Commercial HMO/POS  18  ACCESS3 58.13   NaN NaN True    0.60
515 1021102 Baird Medical Group Commercial HMO/POS  18  COORDINATE3 60.02   NaN NaN True    0.70
... ... ... ... ... ... ... ... ... ... ...
8606    1038400 Vf Healthcare   Commercial HMO/POS  18  OFCSTAFF    68.78   NaN NaN True    0.70
8620    1038400 Vf Healthcare   Commercial HMO/POS  18  MDINTERACT3 79.57   NaN NaN True    0.70
8800    1006001 Viva Physicians Commercial HMO/POS  18  ACCESS3 66.25   NaN NaN True    0.70
8869    1017708 Waltz Hospital  Commercial HMO/POS  19  MDINTERACT3 81.01   NaN NaN True    0.70
9142    1028100 Zeke Medical Group  Commercial HMO/POS  18  ACCESS3 56.37   NaN NaN True    0.70

Solution

  • It's just the display format: the stored values are slightly below 0.70 but are shown rounded to two decimal places. Try the following to check:

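    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"reliability": np.random.uniform(.65, .75, 100)})

    # keep rows strictly below 0.7 whose value rounds to 0.7 at two decimal places
    df = df.loc[df.reliability.lt(.7)].assign(twodp=df.reliability.round(2)).query("twodp.eq(.7)")
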
    reliability twodp
    0.695661 0.7
    0.699588 0.7
    0.698993 0.7
    0.697933 0.7
    0.698356 0.7
    0.699906 0.7
    0.695279 0.7
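
To confirm this on the real data, a minimal sketch along the following lines may help. It assumes the cause really is display rounding; the small frame below is a hypothetical stand-in for awards_test_df with made-up values, and the names below and below_2dp are only for illustration. It prints the filtered rows at higher display precision, then filters on the rounded value, which is one possible way to exclude rows that display as 0.70.

```python
import pandas as pd

# Hypothetical stand-in for awards_test_df; the values are made up for illustration.
df = pd.DataFrame({"reliability": [0.60, 0.6957, 0.6999, 0.70, 0.74]})

# The original filter: compares the stored (unrounded) values against 0.70.
below = df.loc[df["reliability"] < 0.70]

# Temporarily raise the display precision to see what is actually stored.
with pd.option_context("display.precision", 10):
    print(below)

# If the comparison should match the two-decimal display, round before comparing.
below_2dp = df.loc[df["reliability"].round(2) < 0.70]
print(below_2dp)
```

Whether to round before comparing depends on what "less than 0.70" should mean for the analysis: the stored value or the value as reported to two decimal places.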