python machine-learning scikit-learn digits

Predicting numbers using sklearn digits dataset - error

I want to build a simple digit prediction model.

Therefore I:

load the sklearn digits dataset
use DecisionTreeClassifier()
fit it to the data
predict the new image

import matplotlib.pyplot as plt 
from sklearn import datasets 
from sklearn import tree
digits = datasets.load_digits() 
clf = tree.DecisionTreeClassifier()
clf = clf.fit(digits.data, digits.target) 
clf.predict(digits.data[-1])

What did I do wrong?

ValueError                                Traceback (most recent call last)
<ipython-input-9-b58a2a08d39b> in <module>()
----> 1 clf.predict(digits.data[-1])

Solution

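The problem is that you are passing a 1D array, while predict expects a 2D array of shape (n_samples, n_features). digits.data[-1] is a single sample of shape (64,). Reshape it with digits.data[-1].reshape(1, -1), or slice with digits.data[-1:] so that it stays 2D.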

The following should do the trick. It also holds out a test set, so the model predicts on samples it was not trained on.

from sklearn import datasets
from sklearn import tree
from sklearn.model_selection import StratifiedKFold

# load the digits dataset
digits = datasets.load_digits()

# separate features and labels
X_digits = digits.data
y_digits = digits.target

# split data into training and testing sets
# (the loop overwrites the split on every pass, so only the last of the 10 folds is kept)
k_fold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
for train_index, test_index in k_fold.split(X_digits, y_digits):
    train_features, test_features = X_digits[train_index], X_digits[test_index]
    train_labels, test_labels = y_digits[train_index], y_digits[test_index]

# fit to model
clf = tree.DecisionTreeClassifier()
clf = clf.fit(train_features, train_labels)

# predict on the testing features
print(clf.predict(test_features))
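
If you only want to repair the original one-liner, a minimal sketch could look like the following. It assumes the same setup as in the question; the names sample, sample_2d and prediction_from_slice, and the random_state=0 argument, are my own additions for illustration.

```python
from sklearn import datasets, tree

digits = datasets.load_digits()

# random_state is fixed only to make the sketch repeatable (an assumption,
# not part of the original question)
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(digits.data, digits.target)

# Indexing a single row gives a flat 1D array of 64 pixel values
sample = digits.data[-1]

# reshape(1, -1) turns it into one row with 64 feature columns
sample_2d = sample.reshape(1, -1)
prediction = clf.predict(sample_2d)

# Slicing instead of indexing keeps the array 2D, so no reshape is needed
prediction_from_slice = clf.predict(digits.data[-1:])

print(sample.shape, sample_2d.shape, prediction)
```

Note that this predicts on a sample the model has already seen during fit, so it only shows the shape fix and says nothing about how well the model generalizes.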

Also, have a look at this. It might provide you with further information.