I am trying to map rows in a 2D list to elements in a list of labels with scikit-learn.
For example:
from sklearn import tree
clf = tree.DecisionTreeClassifier()
#2D list of training data:
training_data = [[1, 2, 3], [1, 2, 4, 5, 6], [5, 7], [1, 2, 3]]
#1D list of training labels:
training_labels = ['a', 'b', 'c', 'a']
clf = clf.fit(training_data, training_labels)
When I run the code, I get "ValueError: setting an array element with a sequence."
I am wondering how to transform the data properly so that I can fit the classifier with the training labels and then use it on test data such as:
testing_data = [[1, 2, 3], [1, 2, 4, 5, 6], [5, 7], [1, 2, 3]]
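If each sublist is treated as one sample, the samples do not all have the same number of features (3, 5, 2 and 3 values here). scikit-learn expects a rectangular array of shape (n_samples, n_features), so the model cannot be fitted on this data as it stands, and that is why the conversion to an array raises the ValueError.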
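One way around this is to make every row the same length before calling fit. The sketch below shows two possible encodings in plain Python; which one is appropriate depends on what the numbers mean, which the question does not say. The helper names pad_rows and multi_hot are hypothetical (they are not part of scikit-learn), and the fill value 0 is an assumption.

```python
# Hypothetical helpers (not part of scikit-learn) that turn ragged rows
# into rows of equal length.

def pad_rows(rows, fill=0):
    """Right-pad every row with `fill` up to the length of the longest row."""
    width = max(len(row) for row in rows)
    return [list(row) + [fill] * (width - len(row)) for row in rows]


def multi_hot(rows, vocabulary=None):
    """Encode each row as 0/1 flags, one column per distinct value.

    Assumes the order of values inside a row does not matter.
    """
    if vocabulary is None:
        vocabulary = sorted({value for row in rows for value in row})
    encoded = [[1 if value in row else 0 for value in vocabulary] for row in rows]
    return encoded, vocabulary


training_data = [[1, 2, 3], [1, 2, 4, 5, 6], [5, 7], [1, 2, 3]]

# Option 1: padding keeps positions but invents a filler value.
padded = pad_rows(training_data)

# Option 2: multi-hot treats each row as a set of tokens.
flags, vocabulary = multi_hot(training_data)

print(padded)
print(vocabulary)
print(flags)
```

Either result is rectangular, so it could be passed to clf.fit(...) in place of training_data. Test data would need the same treatment: padded to the same width, or encoded with the vocabulary returned from the training data. If the multi-hot form fits your data, scikit-learn's MultiLabelBinarizer in sklearn.preprocessing produces a similar 0/1 encoding.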
Also, you probably mean:
training_labels = ["a", "b", "c", "a"]
Otherwise, a, b and c would have to be defined variables.