Search code examples
indexingcouchdbb-tree

How does couchdb retrieve all previous revisions?


From what I understand, CouchDB's Btree implementation actually uses Shadowing technique, and every update will produce new root, the following excerpts from this PDF (it looks like implementing a better algorithm than traditional shadowing).

Shadowing means that to update an on-disk page, the entire page is read into memory, modified, and later written to disk at an alternate location. When a page is shadowed its location on disk changes, this creates a need to update (and shadow) the immediate ancestor of the page with the new address. Shadowing propagates up to the file system root.

How does couchdb implement fetching all leaf revisions as possible as it can( since some revisions are removed by compacting process)? Does couch internally store a pointer which points to previous revisions?

Thanks Chang


Solution

  • By good luck, CouchDB committer and community leader Adam Kocoloski explained this recently on the mailing list.

    Here is what he said:

    "Each leaf in the ID btree [stores] a revision tree containing pointers to all available revisions of a document. Retrieving an old revision (before compaction) or a conflicting version of a document requires exactly the same number of IOs as retrieving the current one."

    If I understand correctly, shadowing is not used to conceal old document revisions at all, but rather entire revision trees that are no longer meaningful.