deep-learning action reinforcement-learning q-learning

How to select the action with highest Q value

I have implemented DQN with experience replay. The input is 50x50x1, so with a batch size of 4 the input has shape (4,50,50,1). There are 10 output actions, so with a batch size of 4 the output has shape (4,10). How do I select the max Q-value for each sample from this (4,10) tensor? Thanks in advance.

Solution

You are probably looking for tf.math.reduce_max (also available under the alias tf.reduce_max):

X_max = tf.reduce_max(X)

This returns a single maximum value from a given tensor X.

In the context of DQN, with a batch size of 4 (4 rows), you would want to select 4 maximum Q values, one for each row. You can do this with the following:

X_max = tf.reduce_max(X, axis=1)

Here X is the tensor of Q-values with shape (4,10). This returns the 4 maximum Q-values in a single tensor X_max of shape (4,). Pass keepdims=True if you need the output to have shape (4,1) instead.
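Since the question title asks about the action rather than its value, here is a minimal sketch showing both side by side. It assumes TensorFlow 2.x in eager mode. The Q-values are made up for illustration (each row has a single 1.0 at a known index), and the names q_values, max_q, max_q_col and best_actions are placeholders, not anything from the original code.

```python
import tensorflow as tf

# Hypothetical Q-values for a batch of 4 states and 10 actions.
# Each row is all zeros except for a 1.0 at the chosen index,
# so the expected argmax per row is known in advance.
q_values = tf.one_hot([2, 7, 0, 9], depth=10)  # shape (4, 10)

# Maximum Q-value per row; the reduced axis is dropped -> shape (4,)
max_q = tf.reduce_max(q_values, axis=1)

# Same values, but keeping the reduced axis -> shape (4, 1)
max_q_col = tf.reduce_max(q_values, axis=1, keepdims=True)

# Index of the best action per row -> shape (4,), dtype int64
best_actions = tf.argmax(q_values, axis=1)

print(max_q.shape, max_q_col.shape, best_actions.numpy())
```

tf.reduce_max gives the values you would typically use when building the Bellman target, while tf.argmax gives the action indices you would use for greedy action selection.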
