deep-learning, action, reinforcement-learning, q-learning

How to select the action with highest Q value


I have implemented DQN with experience replay. The input is 50x50x1, so with a batch size of 4 the input has shape (4,50,50,1). There are 10 output actions, so with a batch size of 4 the output has shape (4,10). How would I select the max Q-value out of this (4,10) tensor? Thanks in advance.


Solution

  • This is probably what you're looking for: tf.math.reduce_max (also available as tf.reduce_max).

    X_max = tf.reduce_max(X)
    

    With no axis argument, this returns a single maximum value taken over all elements of the tensor X.

    In the context of DQN, with a batch size of 4 (4 rows), you would want to select 4 maximum Q values, one for each row. You can do this with the following:

    X_max = tf.reduce_max(X, axis=1)
    

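    Here X is the tensor containing the Q values, with shape (4,10). This returns the 4 maximum Q values in a single tensor X_max with output shape (4,); pass keepdims=True if you need shape (4,1). If you want the index of the best action in each row rather than its Q value, use tf.argmax(X, axis=1) instead.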
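
To make the shapes concrete, here is a minimal sketch assuming TensorFlow 2.x with eager execution. The name q_values and the random data are made up for illustration and stand in for your network's (4,10) output.

```python
import numpy as np
import tensorflow as tf

# Hypothetical Q-values for a batch of 4 states and 10 actions,
# standing in for the network output described in the question.
rng = np.random.default_rng(0)
q_values = tf.constant(rng.normal(size=(4, 10)), dtype=tf.float32)

# Highest Q-value in each row: shape (4,)
max_q = tf.reduce_max(q_values, axis=1)

# Same values, but keeping the reduced axis: shape (4, 1)
max_q_col = tf.reduce_max(q_values, axis=1, keepdims=True)

# Index of the action with the highest Q-value in each row: shape (4,), dtype int64
best_actions = tf.argmax(q_values, axis=1)

print(max_q.shape, max_q_col.shape, best_actions.shape)
```

The per-row maximum is typically what goes into the DQN target, while the argmax indices are what you would use to pick a greedy action for each state in the batch.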