Search code examples
databaseormnosqldocument-databaseodm

Motivations for using relational database / ORM or document database / ODM


It's been a long time since I haven't created a project from scratch, and now document-oriented databases (as well as ODM) have become quite popular, so I have to consider them before blindly going the relational route.

Could anyone try to list the motivations / project criteria that could lead to one choice or another?


Solution

  • ORM / relational database / SQL

    Pros:

    • well-understood, standard approach
    • maps well to data with consistent structure
    • maps well to data with multiple relationships between multiple entities
    • has extensive join capabilities
    • has transactions
    • scalable to huge number of transactions per second (using MySQL Cluster, Fusion-IO, etc)

    Cons:

    • hard to scale to huge volume of data, if performance is also a concern
    • does not map well to data with variable structure (or semi-structured)
    • persisting objects requires a glue/translation layer which can be a performance bottleneck (and if done wrong also can be very verbose)

    ODM / document database / NoSQL

    Pros:

    • scalable to huge volume of data, and to huge numbers of relatively independent queries
    • high availability, sharding, multi-master, ...
    • maps well to semi-structured data
    • maps well to data with more variable structure
    • data model can be more flexible
    • queries do not have to be translated to SQL (the native NoSQL query style may or may not be better suited for some uses, and no overhead from SQL driver/parsing/etc)
    • (for object databases) maps directly to objects, no need for object-relational translation

    Cons:

    • often, no join (or limited version of join)
    • often, no transactions (or a limited version of transactional consistency/atomicity)

    How to decide

    Based on the type of data and usage patterns:

    • Does the data have a uniform structure? (relational) ... or a variable/inconsistent structure? (document)
    • Does typical usage read/write a single type of entity? (document) ... or a view which is composed of properties of multiple entities? (relational)
    • Are transactions required? (relational) ... or no transactions needed? (document)

    Based on the scaling/performance requirements:

    • Huge data + few, slow, complex read/writes? (data warehousing type scenario) => relational
    • Huge data + huge volume of simple read/writes? (craigslist backend type scenario) => document
    • Huge data + fast, complex reads/writes? => this is difficult; either use relational and try to scale it up, or use document and try to simplify queries
    • Moderate data + fast transactional writes? (banking type scenario) => relational
    • Moderate data + moderate read/write? => pick any based on vendor/tool support, familiarity, etc

    References

    (Background: I haven't done anything in this area very recently, but a few years ago I built a large system which used replicated MySQL + Sphinx, ie a relational and document hybrid)