I have read articles from DataStax about Apache Cassandra, and I noticed that whatever data we write is distributed equally among all the nodes. Is this the case in all other distributed database management systems? Do other systems distribute data equally among their databases? If they don't distribute it equally, then how is the data distributed among those distributed databases?
I noticed that whatever data we write is distributed equally among all the nodes.
Not necessarily. The level of data duplication you have is determined by your replication factor, which is set on a per-keyspace basis. Let's say that I have a cluster of 3 nodes, and I define my keyspace like this:
CREATE KEYSPACE stackoverflow
WITH replication = {'class': 'NetworkTopologyStrategy', 'MyDC': '3'};
In this case, "yes": with a replication factor of 3 on a 3-node cluster, every node holds a full copy of my data. But let's say that I'm running out of disk space, and (as a start-up) I cannot afford to buy bigger hard drives. In that case, I may alter my keyspace to have a replication factor of 2 instead:
ALTER KEYSPACE stackoverflow
WITH replication = {'class': 'NetworkTopologyStrategy', 'MyDC': '2'};
This way, each node is only responsible for two-thirds of my data (the disk space is actually reclaimed once nodetool cleanup has been run on each node). Of course, the drawback here is that I can now only tolerate the loss of a single node in my cluster.
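To make the "two-thirds" figure concrete, here is a minimal Python sketch of replica placement on a 3-node ring. It is a deliberate simplification and not Cassandra's actual implementation: the node names, the MD5-based hash, and the "owner plus next nodes clockwise" rule are all assumptions made for illustration (real clusters use a partitioner such as Murmur3 with token ranges).

```python
import hashlib
from collections import Counter
from typing import Dict, List

# Hypothetical node names for a 3-node cluster (illustration only).
NODES = ["node1", "node2", "node3"]


def replicas_for(key: str, replication_factor: int) -> List[str]:
    """Hash the key to pick an owning node, then walk the ring clockwise
    to pick the remaining replicas. Simplified stand-in for a partitioner."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(replication_factor)]


def share_per_node(replication_factor: int, num_keys: int = 30000) -> Dict[str, float]:
    """Return, for each node, the fraction of all keys it stores a copy of."""
    counts = Counter()
    for i in range(num_keys):
        for node in replicas_for(f"key-{i}", replication_factor):
            counts[node] += 1
    return {node: counts[node] / num_keys for node in NODES}


if __name__ == "__main__":
    # RF=3 on 3 nodes: every node stores every key.
    print(share_per_node(3))
    # RF=2 on 3 nodes: every node stores roughly two-thirds of the keys.
    print(share_per_node(2))
```

With a replication factor of 3, every node's share is 1.0. With a replication factor of 2, each share comes out close to 0.67, which is where the two-thirds figure comes from.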
Is this the case in all other distributed database management systems? Do other systems distribute data equally among their databases?
Simply put, "no" and "no."
If they don't distribute it equally, then how is the data distributed among those distributed databases?
As there are hundreds of distributed DBMSs out there (including both NoSQL and RDBMSs which claim to be "distributed" in some manner), I cannot possibly begin to summarize (even generally) how they all distribute their data. But I will say that several of them utilize the concepts of a "shard key" and/or "secondary nodes" to achieve distribution and scale.
In Cassandra, all nodes are equal; there is no concept of a "master node." But some systems have the concept of a "primary" or "master" node, as well as "secondary" nodes. In these scenarios, the master handles all of the write operations, and replicates data to one or more secondaries. Systems that use a shard key work differently: a certain range of the shard key's values is assigned to each node. Data is then stored only on the node(s) responsible for the range that the data's shard key falls into.
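As a rough illustration of both ideas, the Python sketch below routes each write to the shard whose range contains the shard key, and has that shard's primary copy the write to its secondaries. Everything in it is hypothetical: the shard names, the range boundaries, and the in-memory dictionaries are made up for the example, and the replication is synchronous only to keep the code short. It does not model any particular product.

```python
import bisect
from typing import Dict, List

# Hypothetical layout: each shard owns shard-key values below its upper bound,
# so shard-a owns [0, 1000), shard-b owns [1000, 2000), shard-c owns [2000, 3000).
UPPER_BOUNDS = [1000, 2000, 3000]
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(shard_key: int) -> str:
    """Return the name of the shard whose range contains shard_key."""
    if shard_key < 0 or shard_key >= UPPER_BOUNDS[-1]:
        raise ValueError(f"shard key {shard_key} is outside every range")
    return SHARDS[bisect.bisect_right(UPPER_BOUNDS, shard_key)]


class Shard:
    """One shard: a primary that accepts writes, plus secondary copies."""

    def __init__(self, name: str, num_secondaries: int = 2) -> None:
        self.name = name
        self.primary: Dict[int, str] = {}
        self.secondaries: List[Dict[int, str]] = [{} for _ in range(num_secondaries)]

    def write(self, shard_key: int, value: str) -> None:
        # The primary handles the write, then copies it to each secondary.
        self.primary[shard_key] = value
        for secondary in self.secondaries:
            secondary[shard_key] = value


def write(cluster: Dict[str, Shard], shard_key: int, value: str) -> str:
    """Route a write to the responsible shard and return that shard's name."""
    name = shard_for(shard_key)
    cluster[name].write(shard_key, value)
    return name


if __name__ == "__main__":
    cluster = {name: Shard(name) for name in SHARDS}
    print(write(cluster, 1500, "some row"))  # lands on shard-b only
```

Note that the row with shard key 1500 exists on shard-b (its primary and secondaries) and nowhere else, which is the sense in which such systems do not spread every piece of data across every node.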