Search code examples
mysqldatabasehierarchyhierarchical-data

mysql hierarchy storage with large trees


I don't know how to store my hierarchical data in my innoDB-table.

I've read a lot about the disadvantages of the method of storing the parent_id in each row. But now the problem is, that I have a very large database (~50 million rows). The hierarchy is mostly not very deep (3-6 levels).

Many websites advise taking the "Nested Set Model" as a better alternative to the parent-id-storing-method. But there are always changes being made (UPDATE, INSERT etc.) by the users of the website and because of the size of my table, this would take too much time (since changes in the "Nested Set Model" have a very low performance).

So my question is: How do you store efficiently large hierarchical data with many update/insert commands? (Also blocking the whole table is not an option [-> innoDB-table])


Solution

  • The Nested Sets design is definitely difficult when you need to make frequent updates to the tree. You end up having to renumber large parts of the tree.

    One suggestion for mitigating this is to use floating-point numbers instead of integers. If you insert a new node in the tree, it's relatively easy to find some FLOAT numbers in between the nested set numbers of the parent of the new node. You may eventually get to the limits of the precision of a floating-point number, but since your tree isn't very deep that won't happen for a long time.

    Another technique which I have written about I call Closure Table. This method of storing hierarchies makes it much easier to insert/update/delete nodes in a large tree without needing to update a lot of your tree. And you can still query the whole tree or any subtree in a single non-recursive SQL query.

    To read more about Closure Table, see:


    Re your comment:

    Adjacency List is simple, has a minimum of redundancy, and it supports FK relationships, which Nested Sets does not.  Adjacency List supports querying a whole tree of arbitrary depth if you use recursive queries. But MySQL does not support recursive queries.

    If you need to query only immediate parent-child relationships (i.e. one level of depth), or otherwise query only trees of fixed depth, then Adjacency List is fine.