What I'm trying to do is build a regressor whose underlying model depends on the value of one feature.
That is to say, I have some columns, one of which is more important than the others (let's suppose it is gender; of course, it is different from the target value Y).
I want to say:
- If the gender is Male, then use the randomForest regressor
- Else use another regressor
Do you have any idea whether this is possible using sklearn or any other library in Python?
You might be able to implement your own regressor. Let us assume that gender is the first feature and that Male is encoded as 1. Then you could do something like:
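    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.neighbors import KNeighborsRegressor

    class MyRegressor:
        '''uses different regressors internally'''

        def __init__(self):
            # default hyperparameters are only a placeholder choice
            self.randomForest = RandomForestRegressor()
            self.kNN = KNeighborsRegressor()

        def fit(self, X, y):
            '''calls the appropriate regressors (X and y are NumPy arrays)'''
            mask = X[:, 0] == 1  # rows where gender (column 0) equals 1
            self.randomForest.fit(X[mask], y[mask])
            self.kNN.fit(X[~mask], y[~mask])
            return self

        def predict(self, X):
            '''predicts values using regressors internally'''
            mask = X[:, 0] == 1
            results = np.zeros(X.shape[0])
            # skip an empty group: sklearn raises an error on zero-row input
            if mask.any():
                results[mask] = self.randomForest.predict(X[mask])
            if (~mask).any():
                results[~mask] = self.kNN.predict(X[~mask])
            return results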
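If you also want the combined model to work with sklearn utilities such as cross_val_score, one option is to follow the sklearn estimator conventions. The following is only a sketch under some assumptions, not part of the approach above as written: the class name SplitRegressor and its parameters (column, value, match_estimator, other_estimator) are made up for illustration, the default models are arbitrary, the data is synthetic, and both groups are assumed to be present in the training data.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor


class SplitRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical wrapper: routes each row to one of two regressors by a column value."""

    def __init__(self, column=0, value=1, match_estimator=None, other_estimator=None):
        # Only store the arguments here, so get_params() and clone() keep working
        self.column = column
        self.value = value
        self.match_estimator = match_estimator
        self.other_estimator = other_estimator

    def _mask(self, X):
        # True for rows whose routing column equals the chosen value
        return X[:, self.column] == self.value

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        mask = self._mask(X)
        # Illustrative defaults; assumes both groups occur in the training data
        match = RandomForestRegressor(random_state=0) if self.match_estimator is None else self.match_estimator
        other = KNeighborsRegressor() if self.other_estimator is None else self.other_estimator
        # Fit clones so the estimators passed in stay unfitted
        self.match_ = clone(match).fit(X[mask], y[mask])
        self.other_ = clone(other).fit(X[~mask], y[~mask])
        return self

    def predict(self, X):
        X = np.asarray(X)
        mask = self._mask(X)
        results = np.zeros(X.shape[0])
        # Skip empty groups, since sklearn rejects zero-row input
        if mask.any():
            results[mask] = self.match_.predict(X[mask])
        if (~mask).any():
            results[~mask] = self.other_.predict(X[~mask])
        return results


# Synthetic data, invented purely for illustration
rng = np.random.default_rng(0)
gender = rng.integers(0, 2, size=200)   # column 0: 1 = Male, 0 = otherwise
others = rng.normal(size=(200, 3))      # three additional numeric features
X = np.column_stack([gender, others])
y = 2.0 * X[:, 1] + X[:, 0] + rng.normal(scale=0.1, size=200)

scores = cross_val_score(SplitRegressor(), X, y, cv=3)  # one R^2 score per fold
print(scores)
```

Inheriting from RegressorMixin gives the class a score method (R^2), and BaseEstimator gives it get_params and set_params, which is what cross_val_score relies on when it clones the model for each fold.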