Tags: database, algorithm, database-design, rdbms, bigtable

How do the newer database models achieve better scalability and performance as compared to a traditional RDBMS implementation?


We have a number of these newer data stores (BigTable and the like), all aiming towards one common goal: making data management as scalable as possible.

By scalability, I mean that the cost of usage should not go up drastically as the size of the data increases.

RDBMSs are slow when the amount of data is large, because the number of indirections invariably increases, leading to more I/Os.

[figure illustrating the index indirections]
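To make the indirection point concrete, here is a rough sketch (the fanout of 500 keys per index node is an assumed, illustrative figure, not a measurement of any particular RDBMS) of how the depth of a B-tree-style index, and hence the number of I/Os per lookup, grows with table size:

```python
import math

def index_levels(rows: int, fanout: int = 500) -> int:
    """Rough number of B-tree levels (indirections) needed to
    locate one row, assuming each index node holds `fanout` keys."""
    return max(1, math.ceil(math.log(rows, fanout)))

# Each extra level is roughly one more disk I/O per lookup.
for rows in (10_000, 10_000_000, 10_000_000_000):
    print(f"{rows:>14,} rows -> ~{index_levels(rows)} index I/Os per lookup")
```

The depth only grows logarithmically, but every extra level is another random I/O on a lookup path that runs through a single machine's storage.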

How do these custom, scalability-friendly data management systems solve the problem?

This is a figure from this document explaining Google BigTable:

[figure from the BigTable document]

Looks the same to me. How is the ultra-scalability achieved?


Solution

  • The "traditional" SQL DBMS market really means a very small number of products, which have traditionally targeted business applications in a corporate setting. Massive shared-nothing scalability has not historically been a priority for those products or their customers. So it is natural that alternative products have emerged to support internet scale database applications.

    This has nothing to do with the fact that these new products are not "Relational" DBMSs. The relational model can scale just as well as any other model. Arguably the relational model suits these types of massively scalable applications better than, say, network (graph-based) models. It's just that the SQL language has a lot of disadvantages, and no one has yet come up with suitable relational NOSQL (non-SQL) alternatives.
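To make the shared-nothing idea concrete, here is a minimal, hypothetical sketch (a toy in-memory class, not BigTable's actual design or API) of range partitioning: each shard owns a contiguous key range and could sit on a separate machine, so a lookup touches only one node and capacity grows by adding shards.

```python
import bisect

class ShardedStore:
    """Toy shared-nothing key/value store: each shard owns a
    contiguous key range and could live on a separate machine."""

    def __init__(self, split_points):
        # split_points like ["g", "p"] -> ranges [min,"g"), ["g","p"), ["p",max)
        self.split_points = sorted(split_points)
        self.shards = [dict() for _ in range(len(split_points) + 1)]

    def _shard_for(self, key):
        # One cheap routing step replaces a single, centralized lookup path.
        return self.shards[bisect.bisect_right(self.split_points, key)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key):
        return self._shard_for(key).get(key)

store = ShardedStore(["g", "p"])
store.put("apple", 1)
store.put("kiwi", 2)
store.put("zebra", 3)
print(store.get("kiwi"))               # 2, served by the middle shard only
print([len(s) for s in store.shards])  # [1, 1, 1] -> load spread across shards
```

BigTable, for example, applies the same idea at scale: each table is split into row ranges (tablets) that are served by many different machines.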