ValueError: Data must be 1-dimensional


When I try to create a DataFrame to compare the actual (test data) and predicted values of a classification model, I get an error. The code and the resulting error message are shown below:

    df = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
    df

    ValueError                                Traceback (most recent call last)
    <ipython-input-22-45d19b608f98> in <module>
    ----> df = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
    ValueError: Data must be 1-dimensional

Edit: Sample contents of y_test and y_pred, along with their data types, are as follows:

     y_test
Out:array([[0],
   [0],
   [0],
   [0],
   [0],
   [1],
   [6],
   [0],
   [9],
   [0],
   [0],
   [5],
   [0],
   [6],
   [7],
   [0],
   [0],
   [9],
   [5],
   [0],
   [0],
   [0],
   [0],
   [5],
   [0],
   [3],
   [7],
   [0],
   [0],
   [9],
   [9],
   [0],
   [9],
   [7],
   [0],
   [0],
   [0],
   [0],
   [0],
   [9],
   [0],
   [8],
   [7],
   [9],
   [7],
   [5],
   [9],
   [0],
   [0],
   [0],
   [0],
   [0],
   [7],
   [0],
   [0],
   [0],
   [7],
   [3],
   [4],
   [5],
   [1],
   [8],
   [0],
   [9],
   [0],
   [0],
   [0],
   [0],
   [0],
   [8],
   [7],
   [0],
   [0],
   [0],
   [7],
   [5],
   [8],
   [7]])

  y_test.dtype
  Out:dtype('int32')

  y_pred
  Out:array([0, 7, 0, 3, 8, 0, 8, 7, 8, 0, 5, 7, 0, 0, 7, 0, 8, 0, 7, 0, 0,
     0, 3, 9, 0, 5, 0, 7, 9, 7, 0, 5, 5, 7, 0, 0, 0, 0, 9, 8, 7, 8, 1, 5,
     0, 0, 0, 0, 0, 0, 7, 0, 5, 0, 8, 9, 9, 0, 9, 0, 0, 9, 0, 9, 0, 3,
     0, 0, 9, 0, 0, 0, 0, 0, 0, 7, 0, 7])

  y_pred.dtype
  Out:dtype('int32')
  

How can I solve this issue?


Solution

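  • y_test is a 2-D array: each element is itself a one-element array, so its shape is (n, 1) rather than (n,). pandas expects each column in the dict to be 1-dimensional, which y_pred already is. Use y_test.flatten() to convert y_test to a 1-D array: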

    df = pd.DataFrame({'Actual': y_test.flatten(), 'Predicted': y_pred})
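
To see the shape mismatch for yourself, here is a small self-contained sketch. The arrays are short made-up stand-ins for the ones in the question, not the real data. It also uses ravel(), which should work equally well here: as far as I know, the only relevant difference is that flatten() always returns a copy, while ravel() returns a view when it can.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's arrays (not the real data)
y_test = np.array([[0], [1], [6], [9]], dtype=np.int32)  # column vector
y_pred = np.array([0, 7, 6, 9], dtype=np.int32)          # already 1-D

# Inspecting the shapes shows the mismatch: (4, 1) versus (4,)
print(y_test.shape, y_pred.shape)

# Collapse the extra dimension; y_test.flatten() would work the same way
actual = y_test.ravel()

df = pd.DataFrame({'Actual': actual, 'Predicted': y_pred})
print(df)
```

Checking .shape (or .ndim) on both arrays before building the DataFrame is a quick way to catch this kind of problem.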