
Riak python, how to properly connect to a server pool?


If my Riak cluster consists of 5 machines (riak1, riak2, ...), should I create a RiakClient with a single host, or with the complete list of 5 machines in order to achieve redundancy? Is there automated discovery of cluster members?

RiakClient(protocol='http', host='riak1', http_port=8098)

or

RiakClient(protocol='http', nodes=[{
  host:'riak1', 
  host:'riak2',  
  host:'riak3', 
  host:'riak4', 
  host:'riak5'
}])

Is there any alternative to this, such as a load balancer host in front of the Riak cluster nodes?


Solution

  • Is there any particular reason you are using HTTP and not protocol buffers?

    In any case, nodes must be "a list of node configurations, where each configuration is a dict containing the keys 'host', 'http_port', and 'pb_port'" (http://basho.github.io/riak-python-client/client.html).

    That is:

    RiakClient(protocol='http', nodes=[
        {'host': 'riak1', 'http_port': 8098}, 
        {'host': 'riak2', 'http_port': 8098},
        {'host': 'riak3', 'http_port': 8098}, 
        {'host': 'riak4', 'http_port': 8098},
        {'host': 'riak5', 'http_port': 8098}])
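
For a longer host list, the same node configurations can be generated rather than typed out. This is only a minimal sketch: `build_nodes` is a hypothetical helper, not part of the Riak client, and the ports are assumed to be Riak's defaults (8098 for HTTP, 8087 for protocol buffers). Adjust them if your cluster is configured differently.

```python
def build_nodes(hosts, http_port=8098, pb_port=8087):
    """Return one node-configuration dict per host (hypothetical helper)."""
    return [
        {'host': host, 'http_port': http_port, 'pb_port': pb_port}
        for host in hosts
    ]


# riak1 .. riak5, the host names used in the question
hosts = ['riak%d' % i for i in range(1, 6)]
nodes = build_nodes(hosts)

# With the riak package installed, the list is passed to the client as-is:
#   from riak import RiakClient
#   client = RiakClient(protocol='pbc', nodes=nodes)
```

Including both ports in every entry means the same list works whether the client is created with protocol='http' or protocol='pbc'.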
    

    Yes, you can put a load balancer in front of a Riak cluster. It is actually a good idea: the client connects to a single host no matter how many nodes the cluster has, so adding, removing, or replacing nodes does not affect clients.

    If you choose not to use one, however, you have to pass the complete list of Riak nodes explicitly; the client then handles redundancy and load balancing itself ("a random node is selected when a new connection is requested").
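
To make "a random node is selected" concrete, here is a toy illustration of client-side selection. It is not the Riak client's actual code: `pick_node`, `pick_reachable`, and the `is_up` health check are hypothetical names made up for this sketch.

```python
import random

# Same node list as above, HTTP port only
nodes = [{'host': 'riak%d' % i, 'http_port': 8098} for i in range(1, 6)]


def pick_node(nodes, rng=random):
    """Pick one node configuration uniformly at random."""
    return rng.choice(nodes)


def pick_reachable(nodes, is_up, rng=random):
    """Try the nodes in random order; return the first one is_up() accepts."""
    for node in rng.sample(nodes, len(nodes)):
        if is_up(node):
            return node
    raise RuntimeError('no Riak node reachable')


# Example: pretend only riak3 is currently answering
node = pick_reachable(nodes, lambda n: n['host'] == 'riak3')
```

Random selection spreads new connections across the cluster, and having the full list available is what lets a client fall back to another node when one is down.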