Tags: reinforcement-learning, dqn

Why does the Deep Q-Network (DQN) algorithm perform only one gradient descent step?


Why does the DQN algorithm perform only one gradient descent step per update, i.e. train for only one epoch? Wouldn't it benefit from more epochs? Wouldn't its accuracy improve with more training?


Solution

  • Time efficiency.

    In theory, in the policy iteration / evaluation scheme, you should wait for the evaluation step to converge before moving to the next update. In practice, however, this can (a) never happen, or (b) take too long. So people typically take one single gradient step with a small learning rate, in the hope that the critic (Q) is not "too wrong".

    You could try more steps, but in general the number of gradient steps per update is a design choice (a hyperparameter), and the DQN authors presumably found that one step worked best for them.
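To make the answer concrete, here is a minimal sketch of the "one gradient step per update" idea, using a hypothetical linear Q-function `Q(s, a) = w[a] @ s` instead of a deep network (real DQN also uses a target network and experience replay, which are omitted here). The point is that `dqn_update` takes exactly one semi-gradient step on a sampled batch rather than looping until the TD error converges:

```python
import numpy as np

def dqn_update(w, batch, gamma=0.99, lr=1e-3):
    """One semi-gradient descent step on a linear Q-function Q(s, a) = w[a] @ s.

    w     : (num_actions, state_dim) weight matrix (hypothetical toy Q-network)
    batch : list of (state, action, reward, next_state, done) transitions
    Returns the updated weights after a SINGLE gradient step -- no inner loop.
    """
    grad = np.zeros_like(w)
    for s, a, r, s_next, done in batch:
        q_sa = w[a] @ s
        # Bootstrapped TD target; treated as a constant (semi-gradient).
        target = r if done else r + gamma * max(w[b] @ s_next for b in range(len(w)))
        # Gradient of 0.5 * (target - Q(s, a))^2 with respect to w[a].
        grad[a] += -(target - q_sa) * s
    # One step only: in the hope that Q is "not too wrong" after each update.
    return w - lr * grad / len(batch)
```

Repeating the evaluation step to convergence would just mean calling `dqn_update` in an inner loop on the same batch before collecting new data; DQN instead interleaves one such step with environment interaction.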