Search code examples
mysqldatabasenosqlrelational-databaseberkeley-db

Using NoSQL database for relational purpose


Non-relational databases are attracting more attention day by day. The main limitation is that today's complicated data are indeed connected. Isn't it convenient to connect databases as we connect tables in RDBMS? Of course, I just mean simple cases. Imagine three tables of Articles, Tags, Relationships. In a RDBMS like Mysql, we can run three queries to

1. Find ID of a given tag
2. Find Articles connected with the captured Tag ID
3. Fetch the contents of Articles tagged with the term

Instead of three queries, we perform a single query by JOIN. I think three queries in a key/value database like BerkeleyDB is faster than a JOIN query in Mysql.

Is this idea practical? Or other issues are involved to ignore this approach?


Solution

  • NoSQL databases can support relational data models just fine. You're just left to implement the relational mapping yourself in your application, and that effort is typically not insignificant.

    In some applications this extra effort will be worthwhile. Perhaps you only have a small number of tables and the joins you need are very simple. Or perhaps you've done some performance evaluation between a traditional relational DBMS and a NoSQL alternative and found that the NoSQL option is more appropriate for your needs for any number of reasons (performance, scalability, flexibility, whatever).

    You should keep one thing in mind, however. A typical SQL DBMS is basically a NoSQL DB with an optimized, well-built relational engine in front of it. Some databases even let you bypass the relational layer and treat their system like a pure NoSQL DB.

    Therefore, the moment you start to build your own relational mappings and joins on top of a NoSQL DB you should ask yourself, "Didn't someone build this for me already?" The answer may well be "yes", and the solution might be to go with a traditional SQL DBMS.

    To answer the "3 query" part of your question specifically, the answer is "maybe". You certainly might be able to make such a query run faster in a NoSQL DB than in an RDBMS, but you need to keep in mind that there are more things to consider here than just the raw speed of your query:

    1. The technical debt you will incur as you build join-like functionality that you wouldn't have had to build otherwise
    2. The time it will take you to build, test and optimize your query code which will likely be more significant than writing a simple SQL query
    3. Any difference in transactional guarantees or other typical product features (replication, management tools, etc) which you may lose or gain depending on the NoSQL option you choose
    4. The ability to hire DBMs who know how to run your database from an operational perspective

    You might review that list and say to yourself, "No big deal, I'm running a simple app with only a few thousand DB entries and I'll maintain it myself". If so, knock yourself out - Berkeley (and other NoSQL options) would work fine. I've used Berkeley many times for those kinds of applications. But you may have a different answer if you are building the back-end for a significantly-sized SaaS product which might soon have millions of users and very complex queries.

    We can't give a one-size-fits-all answer, unfortunately. You'll have to make the judgement call yourself based on the needs of you application and organization.