tensorflow, keras, data-preprocessing

TensorFlow model: preprocessing time data


I have data on where and when a cab customer entered the vehicle. I want to predict which street the customer wants to go to. My dataset looks like this:

Example

Day, Hour, Minute, Entrance, Destination (Label)

Monday, 10, 45, ExampleStreet, StackOverflowCorner (not preprocessed)

0, 10, 45, 0, 1 (preprocessed)

I preprocessed my dataset like this:

Day -> Number from 0-6 (0 Monday, 1 Tuesday ...)

Hour -> 24-hour format (0-23)

Minute -> No preprocessing

Entrance -> I used LabelEncoder (0 ExampleStreet, 1 ExampleCorner ...)

Destination -> Same as Entrance, using LabelEncoder
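
For illustration, the conversion described above could look roughly like the sketch below. It is plain Python rather than the actual code used here: the sample rows and the helper names `build_index` and `encode_rows` are hypothetical, and street names are numbered in sorted order (scikit-learn's `LabelEncoder` also sorts its classes), so the exact integers may differ from the example row shown earlier.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def build_index(names):
    """Map each distinct name to an integer id, assigned in sorted order."""
    return {name: i for i, name in enumerate(sorted(set(names)))}

def encode_rows(rows):
    """Turn (day, hour, minute, entrance, destination) rows into integer rows."""
    entrance_index = build_index(r[3] for r in rows)
    destination_index = build_index(r[4] for r in rows)
    encoded = [
        (DAYS.index(day), hour, minute,
         entrance_index[entrance], destination_index[destination])
        for day, hour, minute, entrance, destination in rows
    ]
    return encoded, entrance_index, destination_index

# Hypothetical sample rows, only for demonstration
rows = [
    ("Monday", 10, 45, "ExampleStreet", "StackOverflowCorner"),
    ("Tuesday", 8, 5, "ExampleCorner", "ExampleStreet"),
]
encoded, entrance_index, destination_index = encode_rows(rows)
```

Entrances and destinations get separate indexes here, so the same street can receive a different id in each column.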

I have 98 possible destinations, the same number of entrances, and around 700 samples. I tried a TensorFlow model, but the validation accuracy stays near 0.

import tensorflow as tf
from tensorflow import keras

# Feed-forward classifier over the 98 destination classes
model = keras.Sequential([
    keras.layers.Dense(100, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(98, activation='softmax'),
])
optimizer = keras.optimizers.RMSprop()
# Sparse loss: labels are integer class ids, not one-hot vectors
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.sparse_categorical_crossentropy,
              metrics=['accuracy'])

Questions

Did I preprocess my data correctly? Do I need one-hot encoding, or do I need to gather more samples? Would another algorithm (maybe a tree-based one) be more effective?

Thanks in advance...


Solution

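  • You need one-hot encoding for Entrance and Day, and potentially for Hour as well. Integer codes imply an ordering and a distance between streets (or days) that does not really exist.

    You also need more samples: the number of samples should be of the same order as the number of variables in your model. But try one-hot encoding first and see whether it helps.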
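
    As a rough sketch of what the one-hot step could look like, assuming the integer encoding from the question (7 days, 98 entrances) and using only NumPy: the helper names `one_hot` and `build_features` are made up for this example, and scaling the minute to [0, 1) is an extra assumption, not something the question requires.

```python
import numpy as np

N_DAYS, N_HOURS, N_ENTRANCES = 7, 24, 98

def one_hot(indices, depth):
    """Return a (len(indices), depth) float32 matrix with a single 1 per row."""
    indices = np.asarray(indices, dtype=np.int64)
    out = np.zeros((indices.size, depth), dtype=np.float32)
    out[np.arange(indices.size), indices] = 1.0
    return out

def build_features(day, hour, minute, entrance):
    """Concatenate one-hot day, hour and entrance with the scaled minute."""
    # Assumption: minute is kept as a single numeric column scaled to [0, 1)
    minute_scaled = (np.asarray(minute, dtype=np.float32) / 60.0).reshape(-1, 1)
    return np.concatenate(
        [one_hot(day, N_DAYS),
         one_hot(hour, N_HOURS),
         one_hot(entrance, N_ENTRANCES),
         minute_scaled],
        axis=1,
    )

# Hypothetical integer-encoded samples: (day, hour, minute, entrance)
X = build_features(day=[0, 1], hour=[10, 8], minute=[45, 5], entrance=[0, 1])
print(X.shape)  # (2, 130) = 7 + 24 + 98 + 1 columns
```

    The Destination label can stay an integer class id, since `sparse_categorical_crossentropy` expects integer labels rather than one-hot vectors.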