I have data on where and when a cab customer entered the vehicle. Now I want to predict which street they want to be driven to. My dataset looks like this:
Day, Hour, Minute, Entrance, Destination (Label)
Monday, 10, 45, ExampleStreet, StackOverflowCorner (Not PreProcessed)
0, 10, 45, 0, 1 (PreProcessed)
I preprocessed my dataset like this:
Day -> Number from 0-6 (0 Monday, 1 Tuesday ...)
Hour -> 24-hour format, 0-23
Minute -> No preprocessing
Entrance -> I used LabelEncoder (0 ExampleStreet, 1 ExampleCorner ...)
Destination -> Same as Entrance, with LabelEncoder
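For illustration, the steps above could be sketched roughly as follows. The two sample rows and the helper `fit_label_map` are made up, and a plain dictionary stands in for LabelEncoder (it assigns codes in sorted order, which is also what scikit-learn's LabelEncoder does).

```python
# Hypothetical raw rows: (day, hour, minute, entrance, destination).
rows = [
    ("Monday", 10, 45, "ExampleStreet", "StackOverflowCorner"),
    ("Tuesday", 18, 5, "StackOverflowCorner", "ExampleStreet"),
]

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def fit_label_map(values):
    """Map each distinct value to an integer code, in sorted order."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


# Entrance and Destination each get their own, independent mapping.
entrance_map = fit_label_map(r[3] for r in rows)
destination_map = fit_label_map(r[4] for r in rows)

encoded = [
    (DAYS.index(day), hour, minute, entrance_map[ent], destination_map[dest])
    for day, hour, minute, ent, dest in rows
]
print(encoded)  # [(0, 10, 45, 0, 1), (1, 18, 5, 1, 0)]
```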
There are 98 possible destinations, the same number of entrances, and around 700 samples. I already tried TensorFlow, but the validation accuracy stays near 0.
from tensorflow import keras

# Two hidden layers with dropout, then a softmax over the 98 destinations.
model = keras.Sequential([
    keras.layers.Dense(100, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(98, activation='softmax'),
])
optimizer = keras.optimizers.RMSprop()
# Sparse loss: the labels are integer class ids, not one-hot vectors.
model.compile(optimizer=optimizer,
              loss=keras.losses.sparse_categorical_crossentropy,
              metrics=['accuracy'])
Did I preprocess my data correctly? Do I need one-hot encoding, or do I need to gather more samples? Would another algorithm (maybe a tree) be more effective?
Thanks in advance...
You need one-hot encoding for Entrance and Day, and potentially for Hour as well.
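A minimal sketch of what that could look like with NumPy, assuming the integer codes from the question as input. The sample rows, the helper `one_hot`, and the scaling of the minute to [0, 1] are my own assumptions, not something from the question. The Destination label can stay an integer, since `sparse_categorical_crossentropy` expects integer class ids.

```python
import numpy as np

N_DAYS, N_HOURS, N_ENTRANCES = 7, 24, 98  # 98 entrances, as in the question


def one_hot(indices, depth):
    """Return a (len(indices), depth) array with a single 1.0 per row."""
    return np.eye(depth, dtype=np.float32)[np.asarray(indices)]


# Hypothetical preprocessed rows: day, hour, minute, entrance (integer codes).
X_raw = np.array([
    [0, 10, 45, 0],
    [1, 18, 5, 1],
])

X = np.concatenate(
    [
        one_hot(X_raw[:, 0], N_DAYS),
        one_hot(X_raw[:, 1], N_HOURS),
        X_raw[:, 2:3].astype(np.float32) / 59.0,  # minute scaled to [0, 1]
        one_hot(X_raw[:, 3], N_ENTRANCES),
    ],
    axis=1,
)
print(X.shape)  # (2, 130)
```

Each row now has 7 + 24 + 1 + 98 = 130 features, so the network no longer treats street 40 as "twice" street 20.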
You also need more samples: as a rule of thumb, the number of samples should be on the order of the number of variables (parameters) in your model. But try one-hot encoding first and see whether it helps.
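To put numbers on that, here is a rough count of the trainable parameters in the model from the question. The function names are hypothetical, and the 130-feature case assumes a one-hot layout of 7 days + 24 hours + 1 minute value + 98 entrances.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def trainable_params(n_features, hidden=100, n_classes=98):
    """Trainable parameters of the Dense-BatchNorm-Dense-Dense model."""
    return (
        dense_params(n_features, hidden)
        + 2 * hidden  # BatchNormalization: one scale and one shift per unit
        + dense_params(hidden, hidden)
        + dense_params(hidden, n_classes)
    )


print(trainable_params(4))    # 20698, four integer-coded inputs
print(trainable_params(130))  # 33298, one-hot inputs (assumed layout)
```

Either way the model has tens of thousands of parameters against roughly 700 samples, so a smaller network may also be worth trying.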