My Riak cluster consists of 5 machines (riak1, riak2, ...). Should I create a RiakClient
with a single host or with the complete list of 5 machines in order to achieve redundancy? Is there automated discovery of cluster members?
RiakClient(protocol='http', host='riak1', http_port=8098)
or
RiakClient(protocol='http', nodes=[{
host:'riak1',
host:'riak2',
host:'riak3',
host:'riak4',
host:'riak5'
}])
Is there any alternative to this, such as a load balancer host in front of the Riak cluster nodes?
Is there any particular reason you are using HTTP and not protocol buffers?
In any case, nodes must be "a list of node configurations, where each configuration is a dict containing the keys 'host', 'http_port', and 'pb_port'" (http://basho.github.io/riak-python-client/client.html).
That is:
RiakClient(protocol='http', nodes=[
{'host': 'riak1', 'http_port': 8098},
{'host': 'riak2', 'http_port': 8098},
{'host': 'riak3', 'http_port': 8098},
{'host': 'riak4', 'http_port': 8098},
{'host': 'riak5', 'http_port': 8098}])
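If the hostnames follow a pattern, the same list can be generated rather than typed out. This is only a sketch: build_nodes is a hypothetical helper (not part of the riak package), and the ports are the Riak defaults (8098 for HTTP, 8087 for protocol buffers), which may differ in your setup.

```python
def build_nodes(hosts, http_port=8098, pb_port=8087):
    """Return one node-configuration dict per hostname.

    Hypothetical helper; the port values are Riak's defaults and are
    an assumption about your cluster.
    """
    return [{'host': h, 'http_port': http_port, 'pb_port': pb_port}
            for h in hosts]


# riak1 .. riak5, as in the question
nodes = build_nodes(['riak%d' % i for i in range(1, 6)])

# With the riak package installed, the list would then be passed as:
#   from riak import RiakClient
#   client = RiakClient(protocol='http', nodes=nodes)
```

Including 'pb_port' in each dict means the same list can be reused if you later switch to protocol='pbc'.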
Yes, you may use a load balancer in front of a Riak cluster. It is actually a good idea: your client connects to a single host no matter how many nodes there are in the Riak cluster, so adding, removing, or replacing nodes does not affect clients.
If you choose not to, however, you have to pass the complete list of Riak nodes explicitly, for redundancy and load balancing on the client side ("a random node is selected when a new connection is requested").
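To illustrate why the full list matters, here is a rough sketch of random node selection with one node offline. It is not the client's actual implementation: pick_node and the is_up check are hypothetical names made up for this example.

```python
import random


def pick_node(nodes, is_up, rng=random):
    """Pick a random node among those that is_up(node) reports reachable.

    Illustration only; the real client's selection logic may differ.
    """
    candidates = [n for n in nodes if is_up(n)]
    if not candidates:
        raise RuntimeError('no reachable Riak node')
    return rng.choice(candidates)


nodes = [{'host': 'riak%d' % i, 'http_port': 8098} for i in range(1, 6)]
down = {'riak2'}  # pretend this node is offline

# Any of the four remaining nodes may be chosen.
chosen = pick_node(nodes, lambda n: n['host'] not in down)
```

With a single configured host there is nothing to fall back to: if that host is in the "down" set, pick_node raises instead of returning another node.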