I want to delete every row in X_test and y_test where MFD is greater than one. The problem is that train_test_split always gives me randomly shuffled indices. If I try to drop the rows, I get the following error message:
IndexError: index 3779 is out of bounds for axis 1 with size 3488
I can't use the old indices to drop the rows, so how can I get the new ones where MFD > 1?
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=test_size,
                                                    random_state=random_state,
                                                    stratify=y)
mfd_drop_rows = []
i_nr = 0
for i in X_test.MFD:
    if (i > 1):
        mfd_drop_rows.append(X_test.index[i_nr])
    i_nr += 1
X_test_new = X_test.drop(X_test.index[mfd_drop_rows])
y_test_new = y_test.drop(y_test.index[mfd_drop_rows])
Thanks for your help ( =
Not sure what MFD is, but assuming that X_test.MFD gives you an array of numbers, you could use a boolean mask to select or drop rows. A simple example of how to use a mask with NumPy arrays:
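import numpy as np
x = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
mfd = np.array([0.6, 1.3])
mask = mfd > 1        # True for rows where MFD is greater than one
x_new = x[mask, :]    # selects those rows; use x[~mask, :] to drop them instead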
This would give:
x = [[1, 2, 3, 4, 5],
     [6, 7, 8, 9, 10]]
mask = [False, True]
x_new = [[6, 7, 8, 9, 10]]
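Applied to the pandas objects from the question, the same idea might look like the sketch below. The IndexError most likely comes from X_test.index[mfd_drop_rows]: the list already holds index labels (such as 3779), and using them again as positions into a shorter index goes out of bounds. The small X_test and y_test here are made-up stand-ins, and the column name "other" and the label values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the test split: shuffled, non-contiguous index labels
X_test = pd.DataFrame(
    {"MFD": [0.4, 1.7, 0.9, 2.3], "other": [10, 20, 30, 40]},
    index=[3779, 12, 845, 2001],
)
y_test = pd.Series([0, 1, 0, 1], index=X_test.index, name="target")

# Boolean mask aligned on the index labels: True for the rows to keep
keep = X_test["MFD"] <= 1

X_test_new = X_test.loc[keep]
y_test_new = y_test.loc[keep]

# Equivalent label-based drop: pass the labels themselves to drop(),
# not X_test.index[labels], which would treat labels as positions
drop_labels = X_test.index[X_test["MFD"] > 1]
X_test_dropped = X_test.drop(drop_labels)
y_test_dropped = y_test.drop(drop_labels)
```

Because the mask is built from X_test itself, it carries the same shuffled labels as y_test, so both objects stay aligned without tracking the original indices by hand.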