scikit-learn classification random-forest text-classification

TypeError: len() of unsized object


I am trying the random forest classifier from sklearn. When I try to print the classification report, it gives me an error.

This is the code:

randomforestmodel = RandomForestClassifier()
randomforestmodel.fit(train_vectors, data_train['label'])
predict_rfmodel = randomforestmodel.predict(test_vectors)

print("classification with randomforest")
print(metrics.classification_report(test_vectors, predict_rfmodel))

And this is the error:

    classification with randomforest
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-34-f976cec884e4> in <module>()
      1 print("classification with randomforest")
----> 2 print(metrics.classification_report(test_vectors, predict_rfmodel))

2 frames
/usr/local/lib/python3.7/dist-packages/sklearn/metrics/_classification.py in classification_report(y_true, y_pred, labels, target_names, sample_weight, digits, output_dict, zero_division)
   2108     """
   2109 
-> 2110     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
   2111 
   2112     if labels is None:

/usr/local/lib/python3.7/dist-packages/sklearn/metrics/_classification.py in _check_targets(y_true, y_pred)
     83     """
     84     check_consistent_length(y_true, y_pred)
---> 85     type_true = type_of_target(y_true)
     86     type_pred = type_of_target(y_pred)
     87 

/usr/local/lib/python3.7/dist-packages/sklearn/utils/multiclass.py in type_of_target(y)
    308 
    309     # Invalid inputs
--> 310     if y.ndim > 2 or (y.dtype == object and len(y) and not isinstance(y.flat[0], str)):
    311         return "unknown"  # [[[1, 2]]] or [obj_1] and not ["label_1"]
    312 

TypeError: len() of unsized object

Solution

  • You're passing the test features (test_vectors) to classification_report instead of the true test labels.

    As per the documentation, the first parameter should be:

    y_true: 1d array-like, or label indicator array / sparse matrix.

    Ground truth (correct) target values.
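A minimal sketch of the corrected call follows. It assumes synthetic data from make_classification as a stand-in for the vectorized text, and the names train_labels and test_labels are hypothetical; they are not from the question. The point it illustrates is only the argument order: true labels first, predictions second.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the vectorized text data (assumption, not the asker's data).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
train_vectors, test_vectors, train_labels, test_labels = train_test_split(
    X, y, test_size=0.25, random_state=0
)

randomforestmodel = RandomForestClassifier(random_state=0)
randomforestmodel.fit(train_vectors, train_labels)
predict_rfmodel = randomforestmodel.predict(test_vectors)

# y_true (the held-out labels) goes first, y_pred (the predictions) second.
# Passing test_vectors here instead is what triggers the TypeError.
report = classification_report(test_labels, predict_rfmodel)
print("classification with randomforest")
print(report)
```

In the code from the question, the fix would presumably look like metrics.classification_report(data_test['label'], predict_rfmodel), assuming the test labels live in a DataFrame named data_test with the same 'label' column as data_train. That name is a guess based on data_train and may differ in your code.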