I am trying to query a DataFrame for its values. My data consists of six columns: G-p1, G-p2, G-c, H-p1, H-p2, and H-c. Every value is either 'left' or 'right', indicating whether a parent or child has a left- or right-handed genotype (the G columns) or handedness (the H columns). I want to select the rows where the handedness of both parents and the child is 'left'. I've tried:
test1 = pd.DataFrame(data)
test1 = test1.query({
'H-p1': 'left',
'H-p2': 'left',
'H-c': 'left'})
train_data = test1
predict_data = test1
model.fit(test1)
predict_data = predict_data.copy()
predict_data.drop('H-p1', axis=1, inplace=True)
predict_data.drop('H-p2', axis=1, inplace=True)
predict_data.drop('H-c', axis=1, inplace=True)
pred = model.predict_probability(predict_data)
print(pred.to_string())
But I get this error:
ValueError: expr must be a string to be evaluated, <class 'dict'> given
Any suggestions? Thank you!
The query method takes a string expression, not a dict. The expression is similar to the condition you would use to filter with loc.
Try this:
test1 = test1.query("`H-p1` == 'left' and `H-p2` == 'left' and `H-c` == 'left'")
train_data = test1
Backticks (``) are used to quote column names that are not valid Python identifiers, such as these names containing a hyphen.
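For reference, here is a minimal, self-contained sketch of the same filter. The sample rows are made up for illustration (only the column names come from the question), and it assumes pandas 1.0 or later, where backtick quoting handles hyphens. It shows the query string, an equivalent boolean mask with loc, and one way to build the query string from a dict like the one in the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = {
    "G-p1": ["left", "right", "left"],
    "G-p2": ["left", "left", "right"],
    "G-c": ["left", "right", "left"],
    "H-p1": ["left", "left", "right"],
    "H-p2": ["left", "right", "left"],
    "H-c": ["left", "left", "left"],
}
test1 = pd.DataFrame(data)

# Option 1: a query string with backtick-quoted column names.
by_query = test1.query("`H-p1` == 'left' and `H-p2` == 'left' and `H-c` == 'left'")

# Option 2: the equivalent boolean mask with loc.
cols = ["H-p1", "H-p2", "H-c"]
by_loc = test1.loc[(test1[cols] == "left").all(axis=1)]

# Option 3: build the query string from a dict of column -> required value.
conditions = {"H-p1": "left", "H-p2": "left", "H-c": "left"}
expr = " and ".join(f"`{col}` == {val!r}" for col, val in conditions.items())
by_dict = test1.query(expr)

print(by_query)
```

All three options select the same rows. The loc version avoids string parsing entirely, which may be preferable if the column names or values come from user input.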