mysql postgresql scalability large-data nosql

What are the differences between database solutions like MySQL, NoSQL, Cassandra, MongoDB, and PostgreSQL, and when would you use each?


I hope this is a legitimate question.

I have a very large data-set (for me). I have a 639 MB table with over 8 million rows. I will mainly be reading this data, and the data should be essentially persistent (it won't ever really change).

Upon realizing I have over 8 million rows, I began to wonder if the MySQL solution I started with would still be optimal. This got me looking at NoSQL and the different subsets of it (Cassandra, MongoDB, PostgreSQL). These are all subsets of NoSQL, right?

So now, after lots of searching through guides on Google, watching a few presentations, and reading a few PowerPoints, I'm basically just wondering whether things like Cassandra and MongoDB are essentially the same, and whether the SQL alternatives are basically all NoSQL. When is a dataset so big that a NoSQL solution becomes a better fit than the traditional RDBMS solution? Other than large datasets and performance, are there any other reasons to use NoSQL alternatives? And generally I'm wondering which SQL alternatives are best for large datasets and scalability, what qualifies as a large dataset, and what the leading industry standards are for dealing with them.

I'm really interested in what DBAs might have to say about this, as well as web developers. Thank you so much for any helpful tidbits of information; I really appreciate it (even if you're just pointing me towards a resource).

EDIT: This question is on hold because "Many good questions generate some degree of opinion based on expert experience, but answers to this question will tend to be almost entirely based on opinions, rather than facts, references, or specific expertise." I understand where that's coming from. My hope though was to get some insight into what might be the industry standard. Like maybe people will disagree and nitpick on which type of DB to use in this specific instance, but surely there are well known standards which if met will qualify the use of either mysql or nosql. And just as likely there are sub-standards that would qualify the use of either cassandra or mongodb. I was hoping someone with years of experience in the field could either chime in or point me to a resource I could use to have a better grasp in distinguishing between these. I understand if this isn't possible but I hope it is. Cheers, Stephen.


Solution

  • A 639 MB table with 8 million rows isn't anything special for most RDBMSs. It might require some tuning or indexing, but it's not really hard.

    You should choose a DB based on the structure of the data in question. If it is a 'real' table (the data in it can be represented in tabular format), then any RDBMS should fit this case.
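
To illustrate the point about indexing, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for any RDBMS. The table name `readings`, its columns, and the index name `idx_readings_sensor` are all made up for the example, and the row count is kept small so it runs quickly. MySQL or PostgreSQL would show the same idea through their own EXPLAIN output.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query-plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Hypothetical read-mostly table; names and sizes are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor_id INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO readings (sensor_id, value) VALUES (?, ?)",
    ((i % 1000, i * 0.5) for i in range(100_000)),
)
conn.commit()

lookup = "SELECT * FROM readings WHERE sensor_id = ?"

# Without an index the engine has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

# With an index on the filtered column it can jump to the matching rows.
conn.execute("CREATE INDEX idx_readings_sensor ON readings (sensor_id)")
plan_after = query_plan(conn, lookup, (42,))

matches = conn.execute(lookup, (42,)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("rows matched:", len(matches))
```

The exact wording of the plan varies between SQLite versions, but the first plan should report a full scan and the second should name the index. For a table that is read often and rarely changes, adding indexes on the columns you filter by is usually the first tuning step to try before considering a different kind of database.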