Search code examples
sqldatabase-designtreenested-setsanti-patterns

Is this a typo in the "SQL Anti-Patterns" book?


Relevant section:

Nested Sets.

The Nested Sets solution stores information with each node that pertains to the set of its descendants, rather than the node's immediate parent. This information can be represented by encoding each node in the tree with two numbers, which you can call nsleft and nsright.

CREATE TABLE Comments ( 
   comment_id   SERIAL PRIMARY KEY, 
   nsleft       INTEGER NOT NULL,
   nsright      INTEGER NOT NULL,
   bug_id       BIGINT UNSIGNED NOT NULL,
   author       BIGINT UNISGNED NOT NULL,
   comment_date DATETIME NOT NULL,
   comment      TEXT NOT NULL,
   FOREIGN KEY  (bug_id) REFERENCES Bugs (bud_id),
   FOREIGN KEY (author) REFERENCES Accounts(account_id)
);

Each node is given nsleft and nsright numbers in the following way: the nsleft number is less than the numbers of all the node's children, whereas the right number is greater than the numbers of all the node's children. These numbers have no relation to the comment_id values.

(pg. 32 of SQL Anti-Patterns)

Shouldn't it say

the nsleft number is less than the numbers of all the node's descendants, whereas the right number is greater than the numbers of all the node's descendants.

or is my understanding of this concept flawed?


Solution

  • I think it isn't a typo, but you're also correct that it applies to the descendants, so you understand the situation — you just haven't quite thought it all the way through.

    Suppose the tree has 3 levels. For each node at the lowest, leaf level, there are no children. At the intermediate level, each node has an nsleft smaller than the smallest nsleft of any of its children (the leaf nodes) and an nsright larger than the largest nsright of any of its children. Similarly, at the top level, the single root node has an nsleft smaller than the smallest nsleft of any of its direct children, and an nsright larger than the largest nsright of any of its direct children. But, by induction, that means that the root node's nsleft node is smaller than the smallest nsleft of any of its descendents, and its nsright node is larger than largest nsright of any of its descendents.

    So, the statement is accurate as written, but your inference is also correct.