Tags: algorithm, dynamic-programming, theory, backtracking, memoization

Is dynamic programming backtracking with a cache?


I've always wondered about this, and no books state it explicitly.

Backtracking is exploring all possibilities until we figure out that a possibility cannot lead to a valid solution, in which case we drop it.

Dynamic programming, as I understand it, is characterized by overlapping sub-problems. So, can dynamic programming be described as backtracking with a cache (for previously explored paths)?

Thanks


Solution

  • This is one face of dynamic programming, but there's more to it.

    For a trivial example, take Fibonacci numbers:

    F (n) =
            n = 0:  0
            n = 1:  1
            else:   F (n - 2) + F (n - 1)
    
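    For concreteness, here is the same recursion as a runnable Python sketch (the function name fib is just an illustrative choice):

    def fib(n):
        # Plain recursion: overlapping subproblems are recomputed every time.
        if n == 0:
            return 0
        if n == 1:
            return 1
        return fib(n - 2) + fib(n - 1)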

    We can call the above code "backtracking" or "recursion". Let us transform it into "backtracking with cache" or "recursion with memoization":

    F (n) =
           n in Fcache:  Fcache[n]
           n = 0:  0, and cache it as Fcache[0]
           n = 1:  1, and cache it as Fcache[1]
           else:  F (n - 2) + F (n - 1), and cache it as Fcache[n]
    
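    In Python, that "recursion with memoization" might look like the following sketch (the dictionary Fcache mirrors the cache in the pseudocode above):

    Fcache = {}

    def fib_memo(n):
        # Return the cached answer if this state was solved before.
        if n in Fcache:
            return Fcache[n]
        if n == 0:
            result = 0
        elif n == 1:
            result = 1
        else:
            result = fib_memo(n - 2) + fib_memo(n - 1)
        Fcache[n] = result  # cache the answer for this state
        return result

    (In practice, decorating the plain recursive function with functools.lru_cache has the same effect.)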

    Still, there is more to it.

    If a problem can be solved by dynamic programming, there is a directed acyclic graph of states and dependencies between them. There is a state that interests us. There are also base states for which we know the answer right away.

    • We can traverse that graph from the vertex that interests us to all its dependencies, from them to all their dependencies in turn, etc., not branching any further at the base states. This can be done via recursion.

    • A directed acyclic graph can be viewed as a partial order on its vertices. We can topologically sort that graph and visit the vertices in sorted order. Additionally, we can often find some simple total order which is consistent with that partial order (see the sketch after this list).
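
    To make the graph view concrete, here is a small Python sketch of the topological-order option (the names solve_dag, deps, base and combine are mine; graphlib needs Python 3.9+):

    from graphlib import TopologicalSorter

    def solve_dag(deps, base, combine):
        # deps: state -> list of states it depends on (a DAG)
        # base: state -> known answer for a base state
        # combine: (state, answers of its dependencies) -> answer for that state
        answer = {}
        for u in TopologicalSorter(deps).static_order():  # dependencies come first
            if u in base:
                answer[u] = base[u]
            else:
                answer[u] = combine(u, [answer[v] for v in deps[u]])
        return answer

    Fibonacci expressed in this generic form, for n = 10:

    n = 10
    deps = {0: [], 1: [], **{k: [k - 1, k - 2] for k in range(2, n + 1)}}
    print(solve_dag(deps, {0: 0, 1: 1}, lambda u, vals: sum(vals))[n])  # prints 55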

    Also note that we can often observe some structure on the states. For example, the states can often be expressed as integers or tuples of integers. So, instead of using generic caching techniques (e.g., associative arrays to store state->value pairs), we may be able to preallocate a regular array, which is easier and faster to use.


    Back to our Fibonacci example, the partial order relation is just that state n >= 2 depends on states n - 1 and n - 2. The base states are n = 0 and n = 1. A simple total order consistent with this order relation is the natural order: 0, 1, 2, .... Here is what we start with:

    Preallocate array F with indices 0 to n, inclusive
    F[0] = 0
    F[1] = 1
    

    Fine, we have the order in which to visit the states. Now, what's a "visit"? There are again two possibilities:

    (1) "Backward DP": When we visit a state u, we look at all its dependencies v and calculate the answer for that state u:

    for u = 2, 3, ..., n:
        F[u] = F[u - 1] + F[u - 2]
    
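    A runnable Python version of this backward formulation, using the preallocated array discussed above (the function name is mine):

    def fib_backward(n):
        F = [0] * (n + 1)          # preallocate answers for states 0..n
        if n >= 1:
            F[1] = 1
        for u in range(2, n + 1):
            # Pull the answer for u from its already-solved dependencies.
            F[u] = F[u - 1] + F[u - 2]
        return F[n]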

    (2) "Forward DP": When we visit a state u, we look at all states v that depend on it and account for u in each of these states v:

    for u = 1, 2, 3, ..., n - 1:
        add F[u] to F[u + 1]
        add F[u] to F[u + 2]   (only when u + 2 <= n)
    
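    And a Python sketch of the forward formulation, where each solved state pushes its value into the states that depend on it:

    def fib_forward(n):
        F = [0] * (n + 1)
        if n >= 1:
            F[1] = 1
        for u in range(1, n):
            # Push F[u] into the states that depend on state u.
            F[u + 1] += F[u]
            if u + 2 <= n:
                F[u + 2] += F[u]
        return F[n]

    Both functions agree, e.g. fib_backward(10) == fib_forward(10) == 55.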

    Note that in the former case, we still use the formula for Fibonacci numbers directly. However, in the latter case, the imperative code cannot be readily expressed by a mathematical formula. Still, in some problems, the "forward DP" approach is more intuitive (no good example for now; anyone willing to contribute it?).


    One more use of dynamic programming which is hard to express as backtracking is the following: Dijkstra's algorithm can be considered DP, too. In the algorithm, we construct the optimal paths tree by adding vertices to it. When we add a vertex, we use the fact that the whole path to it - except the very last edge in the path - is already known to be optimal. So, we actually use an optimal solution to a subproblem - which is exactly the thing we do in DP. Still, the order in which we add vertices to the tree is not known in advance.
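
    For reference, a compact Python sketch of Dijkstra's algorithm with a binary heap (the adjacency-list representation and the names are mine). Once a vertex is popped with its final distance, the optimal path to it is fixed and is reused to relax its outgoing edges - exactly the "optimal solution to a subproblem" step described above:

    import heapq

    def dijkstra(adj, source):
        # adj: vertex -> list of (neighbour, non-negative edge weight) pairs
        dist = {source: 0}
        done = set()
        heap = [(0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if u in done:
                continue       # stale queue entry
            done.add(u)        # dist[u] is now final (optimal)
            for v, w in adj.get(u, []):
                # Relax: extend the already-optimal path to u by one edge.
                if v not in dist or d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (dist[v], v))
        return dist

    print(dijkstra({'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}, 'a'))
    # {'a': 0, 'b': 1, 'c': 3}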