Search code examples
database-designerlangschemacouchdbmnesia

How do you design a schema to efficiently query nested items in a key-value database?


I'm using Mnesia with Erlang, but this question applies to any key-value db like couchdb, etc.

I'm trying to break free of the RDBMS thought process, but I can't wrap my head around how to efficiently implement this kind of schema.

Say I have a User record, and he has many SubItemA records, which has many SubItem B records, so:

User
-SubItem A
--SubItem B
...

I need to run queries on SubItem B. Is it efficient to do it when it's this nested? Should I just normalize it so it will be quicker?

I have heard of some people using data duplication so the data is both nested and separate, is this ridiculous or is this actually useful in some cases?


Solution

  • The underlying question is, when is the performance good enough?

    Table-scanning the User dictionary isn't excessive overhead if you really do need to examine every SubItem B in detail and the size of the B's dominates the overall size of the dictionary.

    If that isn't good enough, normalize it so you can avoid reading in all the User and SubItem A data up front when you're querying SubItem B. Use a compound key such as (UserId, SubItemAId, SubItemBId) in the SubItem B dictionary if the table is ordered so you can do range queries.

    If that totally kills your User/SubItem A query performance, then consider data duplication as a last resort because it's more error-prone.