tensorflow, machine-learning, keras

# of Units for Dense Layer in TensorFlow


In TensorFlow models, how is the number of units in a Dense layer chosen? For example, the two models below use 32 and 512 units, respectively.

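import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# First example: a Dense layer with 32 units on a 16-feature input
model = Sequential()
model.add(Dense(32, input_shape=(16,)))

# Second example: a hidden Dense layer with 512 units on flattened 28x28 inputs
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])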

Solution

  • Those are called hyperparameters. They should be tuned on a validation set, with the test set held back for the final evaluation, to get higher accuracy out of your model.

    Tuning just means trying different combinations of hyperparameters and keeping the one with the lowest loss or the highest accuracy on the validation set, depending on the problem.

    There are two basic methods:

    Grid search: For each parameter, decide on a range and a step within that range, such as 8 to 64 neurons in powers of two (8, 16, 32, 64), and try every combination of the parameters. The number of models to train and test is the product of the number of values per parameter, so it grows exponentially with the number of parameters and takes a lot of time.

    Random search: Define a range for each parameter as before, but instead of trying every combination, try random sets of parameters, each drawn from a uniform distribution over its range. You can try as many parameter sets as you want, for as long as you can afford. This is essentially an informed random guess.
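
The two methods can be sketched in plain Python as follows. This is only a minimal illustration, not a definitive implementation: `validation_loss` is a hypothetical stand-in (a toy formula) for building a model with the given number of Dense units, training it, and measuring its loss on the validation set. The dropout rate as a second hyperparameter, the ranges, and the number of trials are likewise assumptions chosen for the example.

```python
import itertools
import random


def validation_loss(units, dropout):
    """Hypothetical stand-in for training a model and scoring it.

    In practice this would build a model with Dense(units) and
    Dropout(dropout), fit it on the training set, and return the loss
    on the validation set. The toy formula below simply has its
    minimum at units=32, dropout=0.2.
    """
    return (units - 32) ** 2 / 1000.0 + (dropout - 0.2) ** 2


def grid_search(units_grid, dropout_grid):
    """Try every combination on the grid; return (best_loss, best_params)."""
    best = None
    for units, dropout in itertools.product(units_grid, dropout_grid):
        loss = validation_loss(units, dropout)
        if best is None or loss < best[0]:
            best = (loss, {"units": units, "dropout": dropout})
    return best


def random_search(units_range, dropout_range, n_trials, seed=0):
    """Try n_trials configurations drawn uniformly from each range."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        units = rng.randint(*units_range)      # uniform integer, bounds inclusive
        dropout = rng.uniform(*dropout_range)  # uniform float
        loss = validation_loss(units, dropout)
        if best is None or loss < best[0]:
            best = (loss, {"units": units, "dropout": dropout})
    return best


# Grid search: 4 values * 3 values = 12 models to evaluate.
grid_best = grid_search([8, 16, 32, 64], [0.1, 0.2, 0.3])

# Random search: the budget (12 trials here) is chosen freely.
random_best = random_search((8, 64), (0.0, 0.5), n_trials=12)

print("grid search:  ", grid_best)
print("random search:", random_best)
```

With grid search the cost is fixed by the grid (every extra hyperparameter multiplies the number of models), while with random search the number of trials is a budget you pick independently of how many hyperparameters there are.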