postgres-xl

Can you use Postgres-XL's round robin or hash data sharding options and still have redundancy?


The Postgres-XL 9.5 documentation says that with the round robin or hash data sharding options, each data element is written to only a single node. It gives no further details.

Is data really not stored on more than one node? That seems highly failure-prone and poorly considered if it is the case.

Is replication mode really the only way to have data saved on more than one node? That option does not seem feasible, since it seems to be about three times slower, and I assume it gets slower still as you add more nodes.


Solution

  • So I found my answer and am still shocked by it.

    "HA is not built-in, we have concentrated on the scaling side"

    So as it turns out, if you are not using the DISTRIBUTE BY REPLICATION option and you lose a node, you have lost the entire database. You can set up standby nodes for each of your data nodes, but that obviously doubles the number of nodes needed, and even then it will not fail over automatically if a node goes down. You still have to take down the entire database, manually reconfigure it to use the standby in place of the failed node, and restart it.

    Your only real way to have data protection is to use REPLICATION mode, which makes it MUCH slower and gets slower still as you add more nodes. It does not have failover either: you have to manually remove the failed node and restart it.

    I am at a loss as to how anyone is supposed to use this in a large-scale production environment.

    https://sourceforge.net/p/postgres-xl/mailman/message/32776225/

    https://sourceforge.net/p/postgres-xl/mailman/message/35456205/
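
For reference, here is a minimal sketch of how the three distribution modes discussed above are chosen per table in Postgres-XL, using the DISTRIBUTE BY clause of CREATE TABLE. The table and column names are hypothetical and only there for illustration. The comments restate the behaviour described in the answer and are not a statement about performance on any particular cluster.

```sql
-- Hypothetical tables, for illustration only.

-- Hash sharding: each row is stored on exactly one datanode,
-- chosen by hashing the distribution column (id).
CREATE TABLE events_hash (
    id      bigint PRIMARY KEY,
    payload text
) DISTRIBUTE BY HASH (id);

-- Round robin: rows are handed out to the datanodes in turn.
-- There is still only one copy of each row.
CREATE TABLE events_rr (
    id      bigint,
    payload text
) DISTRIBUTE BY ROUNDROBIN;

-- Replication: every datanode holds a full copy of the table,
-- so each write has to be applied on all of them.
CREATE TABLE lookup_codes (
    code  text PRIMARY KEY,
    label text
) DISTRIBUTE BY REPLICATION;
```

Because the mode is set per table, one option is to replicate only small, rarely written tables and shard the large ones, accepting that the sharded tables have no second copy unless standby datanodes are configured separately.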