I am trying to practice my machine learning skills with Tensorflow/Keras but I am having trouble around fitting the model. Let me explain what I've done and where I'm at.
I am using the dataset from Kaggle's Costa Rican Household Poverty Level Prediction Challenge
Since I am just trying to get familiar with the Tensorflow workflow, I cleaned the dataset by removing a few columns that had a lot of missing data and then filled in the other columns with their mean. So there are no missing values in my dataset.
Next I loaded the new, cleaned, csv in using make_csv_dataset
from TF.
batch_size = 32
train_dataset = tf.data.experimental.make_csv_dataset(
'clean_train.csv',
batch_size,
column_names=column_names,
label_name=label_name,
num_epochs=1)
I set up a function to return my compiled model like so:
f1_macro = tfa.metrics.F1Score(num_classes=4, average='macro')
def get_compiled_model():
model = tf.keras.Sequential([
tf.keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(137,)), # input shape required
tf.keras.layers.Dense(256, activation=tf.nn.relu),
tf.keras.layers.Dense(4, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=[f1_macro, 'accuracy'])
return model
model = get_compiled_model()
model.fit(train_dataset, epochs=15)
Below is the result of that
A link to my notebook is Here
I should mention that I strongly based my implementation on Tensorflow's iris data walkthrough
Thank you!
After a while, I was able to find the issues with your code they are in the order of importance. (First is of highest importance)
You are doing multi-class classification (not binary classification). Therefore your loss should be categorical_crossentropy
.
You are not onehot encoding your labels. Using binary_crossentropy
and having labels as a numerical ID is definitely not the way forward. Instead, you should do onehot encode your labels and solve this like a multi-class classification problem. Here's how you do that.
def pack_features_vector(features, labels):
"""Pack the features into a single array."""
features = tf.stack(list(features.values()), axis=1)
return features, tf.one_hot(tf.cast(labels-1, tf.int32), depth=4)
x = train_df[feature_names].values #returns a numpy array
min_max_scaler = preprocessing.StandardScaler()
x_scaled = min_max_scaler.fit_transform(x)
train_df = pd.DataFrame(x_scaled)
These issues should set your model straight.