python scikit-learn sklearn-pandas onnx onnxruntime

Error making a prediction with Python onnxruntime

I have created a very basic decision tree using the sklearn library. The tree is trained on 4 features:

feat1 INT
feat2 INT
feat3 FLOAT
feat4 FLOAT

And the label/target feature is a boolean value (0 or 1).

I converted the tree into ONNX format and now I want to use the onnxruntime Python library to make a prediction. I found example code on the internet to do this. The problem is that I don't understand exactly what happens in all parts of this code, its functions and its parameters, and this leads to an error. I searched for documentation but couldn't find any.

In the code below I convert the tree model to ONNX format. This succeeds, but there are parts of the code I don't understand. In the initial_type variable, what do I have to enter given the 4 feature columns and the label/target feature I mentioned earlier? For now I have entered FloatTensorType([None, 4]) because I have 4 feature columns, but I have no idea what the None does.

# Convert to ONNX format
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(treeModel, initial_types=initial_type)
with open("path", "wb") as f:
    f.write(onx.SerializeToString())

In the code below I want to make a prediction using the onnxruntime library, but I get this error:

RuntimeError: Either type_proto was null or it was not of sequence type

This is because I don't understand the last line of the code below. I entered {input_name: [4, 8, 77.8, 143.45]} because these are the four values for the feature columns. What am I doing wrong here?

import onnxruntime as rt

sess = rt.InferenceSession("pathToONNXModel")
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
pred_onx = sess.run([label_name], {input_name: [4, 8, 77.8, 143.45]})[0]

Solution

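Did you try {input_name: numpy.array([[4, 8, 77.8, 143.45]], dtype=numpy.float32)}? onnxruntime requires numpy arrays as inputs, not plain Python lists. Because the model input was declared as FloatTensorType([None, 4]), the array should be 2-D float32 with shape (rows, 4), hence the double brackets. The None leaves the number of rows unspecified, so one row or many can be passed in a single call.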
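To make the input format concrete, here is a minimal sketch of how the feature values could be prepared before calling sess.run. The helper names to_model_input and predict are hypothetical (they are not part of onnxruntime), and the sketch assumes the model was converted with a single FloatTensorType([None, 4]) input as in the question.

```python
import numpy as np


def to_model_input(rows):
    """Convert one feature row, or a list of rows, to a float32 array of shape (N, 4)."""
    arr = np.asarray(rows, dtype=np.float32)
    # A single row [feat1, feat2, feat3, feat4] becomes a batch of one: shape (1, 4).
    arr = np.atleast_2d(arr)
    if arr.ndim != 2 or arr.shape[1] != 4:
        raise ValueError(f"expected shape (N, 4), got {arr.shape}")
    return arr


def predict(model_path, rows):
    """Run the ONNX model stored at model_path on the given rows and return the labels."""
    import onnxruntime as rt  # imported here so the helper above works without it

    sess = rt.InferenceSession(model_path)
    input_name = sess.get_inputs()[0].name
    label_name = sess.get_outputs()[0].name
    return sess.run([label_name], {input_name: to_model_input(rows)})[0]


x = to_model_input([4, 8, 77.8, 143.45])
print(x.shape, x.dtype)  # (1, 4) float32
```

With this in place, predict("pathToONNXModel", [4, 8, 77.8, 143.45]) would pass a (1, 4) float32 array to the session. The integer features are cast to float32 as well, since the converted model declares one float tensor for all four columns.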