Every formalism of GTD(λ) that I have seen defines it in terms of function approximation, using θ and some weight vector w.
I understand that the need for gradient methods largely came from their convergence properties with linear function approximators, but I would like to use GTD for its importance sampling.
Is it possible to take advantage of GTD without function approximation? If so, how are the update equations formalized?
I understand that when you say "without function approximation" you mean representing the value function V as a table. In that case, the tabular representation of V can also be seen as a function approximator.
For example, if we define the approximated value function as a linear function of a feature vector φ(s):

$$V_\theta(s) = \theta^\top \phi(s)$$
Then, using a tabular representation, there are as many features as states: the feature vector for a given state s is one in the component corresponding to s and zero everywhere else, and the parameter vector θ stores the value of each state. Therefore GTD, as well as other algorithms, can be used in a tabular way without any modification.
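To make this concrete, here is a minimal sketch of the tabular case with one-hot feature vectors. It is not the full GTD(λ) with eligibility traces. It uses the simpler λ = 0 gradient-correction update (often called TDC or GTD(0)) with an importance sampling ratio `rho`. The function names, step sizes and the three-state example are my own assumptions for illustration only.

```python
import numpy as np

def one_hot(state, n_states):
    """Tabular feature vector: 1 in the component for `state`, 0 elsewhere."""
    phi = np.zeros(n_states)
    phi[state] = 1.0
    return phi

def tdc_update(theta, w, s, r, s_next, rho,
               alpha=0.1, beta=0.05, gamma=0.9):
    """One TDC / GTD(0) step (assumed variant, lambda = 0, no traces).

    theta : primary weights; with one-hot features theta[s] is V(s)
    w     : secondary weight vector used for the gradient correction
    rho   : importance sampling ratio pi(a|s) / b(a|s) for the action taken
    """
    n_states = theta.shape[0]
    phi = one_hot(s, n_states)
    phi_next = one_hot(s_next, n_states)

    # TD error; theta @ phi is simply theta[s] in the tabular case
    delta = r + gamma * (theta @ phi_next) - (theta @ phi)

    # Both updates are computed from the old theta and w
    new_theta = theta + alpha * rho * (delta * phi - gamma * (w @ phi) * phi_next)
    new_w = w + beta * rho * (delta - w @ phi) * phi
    return new_theta, new_w

# Hypothetical usage: 3 states, one observed transition 0 -> 1 with reward 1
theta = np.zeros(3)
w = np.zeros(3)
theta, w = tdc_update(theta, w, s=0, r=1.0, s_next=1, rho=1.0)
```

Because the features are one-hot, every dot product reduces to a table lookup, and each step only touches the entries for `s` and `s_next`. In practice you could index the arrays directly instead of building the vectors.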