
noSQL rollback feature


I'm new to noSQL technologies, and I was surprised that there is no transaction support whatsoever. My main problem is with one of our insert tasks, which consists of ~5 separate inserts. We have to be able to find a document by 4 different IDs. The problem is that the document is fairly large, and it's really expensive to store it like this:

  • key | value
  • user1 | HugeDoc1
  • user2 | HugeDoc1
  • user3 | HugeDoc1

So we came up with an internal ID that points to the document. Yes, I know this design somewhat violates the whole noSQL concept, but it saves a lot of memory. If the document insert fails, the IDs have no meaning and should be removed. Is it a good idea to write my own rollback handling and keep track of successful inserts/updates? Or is the whole concept wrong?
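The layout described above can be sketched roughly as follows. This is only an illustration: the two dicts stand in for two buckets or tables, and the names `documents`, `id_index`, `store_document`, and `find_document` are hypothetical, not part of any real client API.

```python
# Hypothetical in-memory stand-ins for two key-value buckets/tables.
documents = {}  # internal ID -> document body (stored once)
id_index = {}   # external ID (e.g. "user1") -> internal ID


def store_document(internal_id, body, external_ids):
    """Write the document once, then one small pointer row per external ID."""
    documents[internal_id] = body
    for ext_id in external_ids:
        id_index[ext_id] = internal_id


def find_document(ext_id):
    """Two lookups: external ID -> internal ID -> document."""
    internal_id = id_index.get(ext_id)
    if internal_id is None:
        return None
    # A pointer whose document is missing also resolves to None.
    return documents.get(internal_id)


store_document("doc-1", "HugeDoc1", ["user1", "user2", "user3"])
```

Every read costs two lookups, and the pointer rows and the document row are written separately, which is where the partial-failure concern comes from.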


Solution

  • I know this design somewhat violates the whole noSQL concept, but it saves a lot of memory.

    That is a very 1970s way of thinking. Relational database theory originated at a time when disk space was expensive. In 1975, IBM was selling hard drives at $11k per megabyte. By 1980, prices had dropped so that you could buy a gigabyte's worth of storage space for under $1 million. Today, you can go on Newegg and buy a terabyte drive for $60. Disk space is now cheap, and processing time is the expensive part.

    In non-relational (NoSQL) data modeling, you should build your table structures according to how it makes sense to query your data. This is a departure from relational data modeling, where you build your tables according to how it makes sense to store your data. Query-based modeling often results in storing redundant data, and that's OK. Duplicate data for speed; reference data for integrity.
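As a rough sketch of what query-based modeling could look like for the question's case, the document is written once per lookup ID so that every read is a single fetch. The dict stands in for a table keyed by lookup ID, and the names `docs_by_lookup_id`, `store_denormalized`, and `find_denormalized` are hypothetical.

```python
import json

# Hypothetical in-memory stand-in for one table keyed by lookup ID.
docs_by_lookup_id = {}


def store_denormalized(body, lookup_ids):
    """Write one full copy of the document per lookup ID."""
    payload = json.dumps(body)
    for lookup_id in lookup_ids:
        docs_by_lookup_id[lookup_id] = payload


def find_denormalized(lookup_id):
    """Single read, with no second hop through an internal ID."""
    payload = docs_by_lookup_id.get(lookup_id)
    return None if payload is None else json.loads(payload)


store_denormalized({"title": "HugeDoc1"}, ["user1", "user2", "user3"])
```

The trade-off is the one described above: more storage and more writes per insert in exchange for a one-step read, and no pointer row that can be left dangling.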

  • Is it a good idea to write my own rollback handling and keep track of successful inserts/updates? Or is the whole concept wrong?

    I was on a Cassandra project where we implemented something similar to an application-side transaction/rollback. It really didn't work very well, and it ended up creating several tombstones. Ultimately, you should ask yourself exactly why your application needs a non-relational database, because it sounds like you still need some of the benefits of a relational database. If you're sure that you absolutely need a non-relational database, then you may want to rethink your approach to data modeling.
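To make the application-side rollback idea concrete, here is a minimal sketch of the general pattern, not the code from that project. A plain dict stands in for the database, and `insert_all`, `InsertFailed`, and the `fail_on` parameter (used only to simulate a failed write) are all hypothetical. It assumes every key is new; undoing an overwrite would also require remembering the previous value.

```python
class InsertFailed(Exception):
    """Raised by the simulated store when a write fails."""


def insert_all(store, rows, fail_on=None):
    """Apply rows in order; undo the completed writes if any write fails.

    Returns (succeeded, undone_keys). `fail_on` is a hypothetical hook that
    makes the write for that key fail, to exercise the rollback path.
    """
    written = []
    undone = []
    try:
        for key, value in rows:
            if key == fail_on:
                raise InsertFailed(key)
            store[key] = value
            written.append(key)
    except InsertFailed:
        # Compensate in reverse order. In Cassandra, each of these deletes
        # would be recorded as a tombstone rather than removed immediately.
        for key in reversed(written):
            del store[key]
            undone.append(key)
        return False, undone
    return True, undone


rows = [("user1", "doc-1"), ("user2", "doc-1"), ("doc-1", "HugeDoc1")]
store = {}
ok, undone = insert_all(store, rows, fail_on="doc-1")
```

Even in this toy form the weak points show: the compensating deletes can themselves fail, other readers can see the pointer rows before they are undone, and each undo is one more delete for the database to carry.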