Tags: python, deep-learning, data-science, valueerror, k-fold

"ValueError: Supported target types are: ('binary', 'multiclass'). Got 'unknown' instead." in dataset kfold split


I encountered this error:

"ValueError: Supported target types are: ('binary', 'multiclass'). Got 'unknown' instead."

It is raised at line 5 of this Python code:

1    print(data.datasetsNames)
2    for dataset in data.datasetsNames:
3       X, Y, dictActivities = data.getData(dataset)
4
5       for train, test in kfold.split(X, Y):
.
.
.
.
10 def getData(datasetName):
11    X = np.load('./npy/' + datasetName + '-x.npy')
12    Y = np.load('./npy/' + datasetName + '-y.npy')
13    dictActivities = np.load('./npy/' + datasetName + '-labels.npy').item()
14    return X, Y, dictActivities

Y is returned by the getData function. It is a 1-D array whose values are in the range 0 to 6, e.g. Y=[1,2,5,0,0,0,6].

I checked the target types of X and Y with the code below. X was reported as multiclass-multioutput and Y as unknown.

from sklearn.utils.multiclass import type_of_target

print(type_of_target(X))
print(type_of_target(Y))

I read somewhere that LabelEncoder can solve the error, but I could not get it to work:

from sklearn.preprocessing import LabelEncoder

label_encoder = LabelEncoder()
y = label_encoder.fit_transform(target_labels)

Any help would be appreciated. Thanks.

The source code is here: https://github.com/danielelic/deep-casas/blob/master/train.py


Solution

  • As the first comment says, you need to figure out what type Y is. I downloaded the referenced code from GitHub, ran portions of it, and it turns out the values in Y are of type <class 'int'> (plain Python ints, which suggests an object-dtype array rather than a native integer array). Apparently that is not supported by current versions of sklearn.model_selection.StratifiedKFold, which is what your kfold object is. To proceed, add the statement Y = np.array(Y, dtype=int) after your getData() call, and the error should go away. Use the builtin int here rather than np.int: that alias was deprecated in NumPy 1.20 and removed in NumPy 1.24.
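
Here is a minimal sketch of the diagnosis and the fix. The arrays are made-up stand-ins (the real ones come from the .npy files). It assumes the loaded Y is an object-dtype array of Python ints, which is consistent with the 'unknown' result but is not something I have confirmed for every dataset in the repository.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.utils.multiclass import type_of_target

# Stand-in data (assumption): labels in the range 0 to 6, stored with
# dtype=object so that each element is a plain Python int.
Y_obj = np.array([1, 2, 5, 0, 0, 0, 6] * 3, dtype=object)
X = np.zeros((len(Y_obj), 4))  # dummy features; only the row count matters

print(Y_obj.dtype, type(Y_obj[0]))  # object <class 'int'>
print(type_of_target(Y_obj))        # unknown

# Cast to a native integer dtype. The builtin int is used because the
# np.int alias no longer exists in recent NumPy releases.
Y_int = Y_obj.astype(int)
print(type_of_target(Y_int))        # multiclass

# With a native integer dtype the stratified split runs normally.
kfold = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
folds = list(kfold.split(X, Y_int))
for train, test in folds:
    print(len(train), len(test))
```

Printing Y.dtype right after getData() is a quick way to check whether the cast is needed. If it already shows an integer dtype such as int64, the cause is probably something else.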