mongodb rdbms nosql

Is there any way to force a schema to be respected?


First, I'd like to say that I really love NoSQL & MongoDB but I've got some major concerns with its schema-less aspect.

Let's say I have 2 tables: Employees and Movies.

And... I have a very stupid data layer / framework that sometimes likes to save objects in the wrong tables.

So one day, a Movie gets saved in the Employees table. Like this:

> use mongoTests;
switched to db mongoTests
> db.employees.insert({ name : "Max Power", sex : "Male" });
> db.employees.find();
{ "_id" : ObjectId("4fb25ce6420141116081ae57"), "name" : "Max Power", "sex" : "Male" }
> db.employees.insert({ title : "Fight Club", actors : [{ name : "Brad Pitt" }, { name : "Edward Norton" }]});
> db.employees.find();
{ "_id" : ObjectId("4fb25ce6420141116081ae57"), "name" : "Max Power", "sex" : "Male" }
{ "_id" : ObjectId("4fb25db834a31eb59101235b"), "title" : "Fight Club", "actors" : [ { "name" : "Brad Pitt" }, { "name" : "Edward Norton" } ] }

This is VERY wrong.

Let's switch the context and think about Movies and CreditCards (for whatever reason, in this context credit cards would be stored in clear text inside the DB). This is SUPER WRONG:

  1. The code would probably explode because it's trying to use an object structure and receives another totally unknown structure.

  2. Even worse, the code actually works and the webstore visitors actually see credit card information in the "Rent a movie" list.

Is there anything built in that would prevent such a threat from ever happening? Like some way to "force" a schema to be respected for only some tables?

Or is there any way to force MongoDB to make a schema mandatory? (Can't create new fields in a table, etc)

EDIT: For those who think I'm trolling, I'm really not. This is an important question for me and my team because whether or not we're going to use NoSQL is a big decision.

Thanks and have a nice day.


Solution

  • The schema-less aspect is one of the major positives.

    A DB with a schema doesn't fully remove this kind of issue - e.g. there could be a bug in a system that uses an RDBMS that puts the wrong data in the wrong field/table.

    IMHO, the bigger concern would be, how did that kind of bug make it through dev, testing and out into production?!

    Having said that, you could set up a process that checks the "schema" of documents within a collection (e.g. look at newly added documents, check whether they have fields you would expect to see in there) - then flag up for investigation. There is such a tool (node.js) here (I think, I've never used it):

    http://dhendo.github.com/node-mongodb-schema-validator/

    Edit:
    For those finding this question in future, and so the link in my comment doesn't get overlooked: there's a Jira item for this kind of thing here: http://jira.mongodb.org/browse/SERVER-3536
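
    To make the "check newly added documents and flag them up" idea above a bit more concrete, here is a minimal sketch in plain JavaScript. It is not the linked validator tool and not a MongoDB feature: `expectedShapes`, `checkDocument` and `flagSuspicious` are hypothetical names made up for illustration, the expected fields are guessed from the question's sample documents, and the documents are passed in as a plain array rather than read from a real collection.

```javascript
// Hypothetical per-collection "expected shape": field name -> expected typeof.
// Illustrative assumption only; not part of MongoDB or any library.
const expectedShapes = {
  employees: { name: "string", sex: "string" },
  movies: { title: "string", actors: "object" },
};

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Returns a list of human-readable problems found in one document.
function checkDocument(collectionName, doc) {
  if (!hasOwn(expectedShapes, collectionName)) {
    return [`no expected shape registered for "${collectionName}"`];
  }
  const shape = expectedShapes[collectionName];
  const problems = [];

  // Every expected field must be present and have the expected type.
  for (const [field, type] of Object.entries(shape)) {
    if (!hasOwn(doc, field)) {
      problems.push(`missing field "${field}"`);
    } else if (typeof doc[field] !== type) {
      problems.push(`field "${field}" should be ${type}, got ${typeof doc[field]}`);
    }
  }

  // Any field outside the shape (other than _id) is reported as unexpected.
  for (const field of Object.keys(doc)) {
    if (field !== "_id" && !hasOwn(shape, field)) {
      problems.push(`unexpected field "${field}"`);
    }
  }
  return problems;
}

// Flags suspicious documents for investigation instead of rejecting them.
function flagSuspicious(collectionName, docs) {
  return docs
    .map((doc) => ({ doc, problems: checkDocument(collectionName, doc) }))
    .filter((result) => result.problems.length > 0);
}

// The two documents from the question, as if both were found in "employees".
const newlyAdded = [
  { name: "Max Power", sex: "Male" },
  { title: "Fight Club", actors: [{ name: "Brad Pitt" }, { name: "Edward Norton" }] },
];

const flagged = flagSuspicious("employees", newlyAdded);
console.log(JSON.stringify(flagged, null, 2));
```

    With the question's data, only the "Fight Club" document is flagged, since it lacks `name` and `sex` and carries `title` and `actors`. In a real setup the array would presumably come from a query for recently inserted documents, and the result would go to a log or alert rather than the console.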