python tensorflow tensorflow-decision-forests

TensorFlow random forest get label as prediction output

I'm using TensorFlow Decision Forests to predict a suitable crop based on a few parameters. How do I get the predict() method to return the label?

I'm using this dataset for training.

My code

import tensorflow_decision_forests as tfdf
import tensorflow as tf
import pandas as pd
import numpy as np

df = pd.read_csv("Crop_recommendation.csv")

#TensorFlow dataset
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(df,label="label")

# Train the model
model = tfdf.keras.RandomForestModel()
model.fit(train_ds)
print(model.summary())

pd_serving_dataset = pd.DataFrame({
    "N": [83],
    "P": [45],
    "K" : [30],
    "temperature" : [25],
    "humidity" : [80.3],
    "ph" : [6],
    "rainfall" : [200.91],
})

tf_serving_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(pd_serving_dataset)
prediction = model.predict(tf_serving_dataset)

print(prediction)

My Output

1/1 [==============================] - 0s 38ms/step
[[0.         0.         0.         0.         0.02333334 0.07666666
  0.04666667 0.         0.08333332 0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.7699994  0.        ]]

Expected output: rice

Solution

For a classification problem, TensorFlow Decision Forests (TF-DF) returns the probability of each class as a NumPy array, with one row per example and one column per class. If you need the class names, you have to find the class with the highest probability and map it back to its name.

Since Keras expects class labels to be integers, TF-DF silently converts them inside tfdf.keras.pd_dataframe_to_tf_dataset by sorting the class names and mapping each one to its position in the sorted list. To get the names back, you therefore have to reverse that mapping. Overall, you would get

# sorted() returns a new list; list.sort() sorts in place and returns None
classification_names = sorted(df["label"].unique().tolist())
prediction = model.predict(tf_serving_dataset)
# Pick the most probable class for each row and look up its name
class_predictions = [classification_names[i] for i in np.argmax(prediction, axis=1)]
# Since you only predicted a single example, class_predictions = ['rice']
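
The argmax-and-lookup step can be checked without training a model. Here is a minimal sketch, assuming a made-up probability array and a hypothetical four-class label list (neither comes from the crop dataset):

```python
import numpy as np

# Hypothetical class names, sorted alphabetically as described above.
class_names = sorted(["rice", "maize", "cotton", "jute"])  # ['cotton', 'jute', 'maize', 'rice']

# Made-up stand-in for model.predict(...): one row per example, one column per class.
probabilities = np.array([
    [0.05, 0.10, 0.08, 0.77],
    [0.60, 0.20, 0.15, 0.05],
])

# Index of the most probable class in each row, then map the indices back to names.
top_indices = np.argmax(probabilities, axis=1)
predicted_labels = [class_names[i] for i in top_indices]
print(predicted_labels)  # ['rice', 'cotton']
```

The lookup only works if class_names is in the same order as the columns of the probability array, which is why the sort order matters.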

Warning: TF-DF might change the way tfdf.keras.pd_dataframe_to_tf_dataset maps classes to integers at some point in the future. It would be more prudent to perform the mapping yourself by preprocessing the pandas dataframe before converting it, and to keep the classes list so you can decode predictions with it later:

label = "label"
classes = df[label].unique().tolist()
print(f"Label classes: {classes}")

# Replace each class name with its integer index in `classes`
df[label] = df[label].map(classes.index)
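
A minimal sketch of that round trip, assuming a tiny made-up dataframe in place of Crop_recommendation.csv: the labels are encoded with an explicit class list, and the same list decodes a predicted index. The predicted_index value is hypothetical and stands in for np.argmax over the model output.

```python
import pandas as pd

# Made-up stand-in for the crop dataframe.
df = pd.DataFrame({"N": [83, 20, 90], "label": ["rice", "maize", "rice"]})

label = "label"
classes = df[label].unique().tolist()  # order of first appearance: ['rice', 'maize']

# Encode: replace each class name with its position in `classes`.
encoded_df = df.copy()
encoded_df[label] = df[label].map(classes.index)

# Decode: a predicted index maps back through the same list.
predicted_index = 1  # hypothetical argmax result
print(encoded_df[label].tolist())  # [0, 1, 0]
print(classes[predicted_index])    # maize
```

Because you own the classes list here, decoding no longer depends on how TF-DF orders the labels internally.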

For more information on how to make (fast) predictions with TF-DF, you can also check out the TF-DF predictions colab.