I'm surprised at how many commenters on my previous post (here and at Reddit and HN) think it's a mistake to identify dynamic programming with memoization. I thought this was uncontroversial: that the central idea of DP is simply to make naive recursive algorithms efficient by memoizing. Evidently a lot of readers think otherwise.

I think they've been misled by the traditional presentation of DP, which bundles a lot of optimizations with that central idea. The recipe usually goes something like this:

- Generate a naive recursive algorithm.
- Memoize it by saving partial results in an array.
- Invert control, so the table of results is explicitly assembled bottom-up instead of as a byproduct of top-down recursion.
- Analyze the dependencies among the partial results, and if possible, simplify the saved results to avoid storing unnecessary data.
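The first two steps of the recipe can be sketched with the classic Fibonacci example (my choice of illustration, not one from the recipe itself):

```python
# Step 1: the naive recursive algorithm -- exponential time,
# because it recomputes the same subproblems over and over.
def fib_naive(n):
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

# Step 2: memoize it -- same recursive structure, but each partial
# result is saved and computed only once, so it runs in linear time.
def fib_memo(n, memo=None):
    if memo is None:
        memo = {}
    if n < 2:
        return n
    if n not in memo:
        memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
    return memo[n]
```

Note that the memoized version is semantically identical to the naive one; only its performance differs.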

Notice something wrong here? All the action happens in the first two steps. Well, semantically all the action happens in the first step; even the memoization is only an optimization. But what an optimization! It's what makes impractical algorithms practical; the remaining steps don't change much. The inversion of control usually contributes nothing, so I'm puzzled that some commenters describe it as the distinctive feature of DP; it looks like an artifact of the imperative past to me. Simplifying the results is valuable when it works (it can change the space complexity), but it only applies to some problems; usually it yields little or nothing. Memoization is the important part; the rest is optional.

Many programmers learn memoization only as a part of this recipe, and therefore call it by the same name. This is why, when people speak of “a dynamic programming algorithm”, they often mean a memoized one — the process has given its name to its main technique. This isn't very confusing; I wouldn't deprecate this common usage on this account. But I do think it's unfortunate that many programmers know the simple concept of memoization only as part of a larger, less coherent bundle.

What if the concepts were taught in a different order? Imagine you had learned memoization of recursive algorithms as an independent concept, before you had ever heard of dynamic programming. When you later meet the rest of the dynamic programming package — the iterativizing, the compact representations of partial results, the analysis of dependencies — what would you think of it? Would you see it as forming a single coherent idea, or as a collection of *optimizations to memoized recursion* — potentially useful, sure, but not intrinsic to the concept?

It would be the latter, of course: that's how you understand other complex groups of concepts. Take garbage collection, for example. It can fairly be described as just “finding the live objects by tracing the object graph”. There is much more to it, of course — many algorithms, some of which don't look much like tracing, and many optimizations and implementation tricks. But we don't define garbage collection as necessarily including all of these. We define it by its central concept, and treat the others as optional elaborations on that.

We should treat the concepts of dynamic programming the same way. Rather than try to include them all in one package, and refer to them only as a whole, we should let the central idea stand on its own, without reference to its optimizations. When you make a naive recursive algorithm practical by memoizing it, there's no need to describe the result as a “dynamic programming algorithm”; “memoized” already identifies the technique. Why use the vaguer term when you don't have to?
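Indeed, memoization is often available as a standalone, named tool. In Python, for instance, the standard library provides it as a decorator, so the naive recursive algorithm needn't change at all (the binomial-coefficient example is my own, not from the post above):

```python
from functools import lru_cache

# Memoizing the naive recursive definition (Pascal's rule) makes it
# practical -- no bottom-up table construction required.
@lru_cache(maxsize=None)
def binomial(n, k):
    if k == 0 or k == n:
        return 1
    return binomial(n - 1, k - 1) + binomial(n - 1, k)
```

Calling this a "memoized" algorithm says exactly what was done to it.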

Some people see DP differently: not as a method of *implementing* algorithms but as a method of *discovering* them. To them, it's about analyzing dependencies in recursive algorithms, because this can lead to discovering a simpler or more efficient algorithm. The other concepts, like memoization, are optional. (If I understand correctly, from this point of view memoization is simply the general case of saving results blindly, regardless of how they're used — a useful tool, but not conceptually important.) This is a reasonable concept, but it's not the one I'm talking about; I'm more interested in implementing algorithms than discovering them. Maybe there's a divide here between theorists, who care more about discovery, and practitioners, who care more about handy tools. I'm on the practical side, so I find the algorithm-discovery method less interesting than a simple tool for making naive algorithms efficient, because it's less often useful.

In an attempt to forestall the most obvious misunderstanding: I do not claim "dynamic programming", as commonly used, refers only to memoization; it often refers to a recipe which happens to include memoization as its main ingredient. But that main ingredient is much more important than the rest of the recipe. We should give our attention not to the garnishes, but to the simple base on which everything else is an optimization.