Two-node cluster, 2i queries impossible when one node down?

Jared Morrow jared at basho.com
Fri Mar 30 14:15:16 EDT 2012


Cedric,

The problem you are seeing is that the vnodes do not spread perfectly between physical nodes. So with N=2 on only a two-node cluster, you have no assurance that the two copies of your data will end up on vnodes owned by different physical nodes. If, say, data X lands on vnodes 15 and 18, both of those vnodes might be owned by 192.168.0.21.
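
To make that concrete, here is a rough sketch in Python (this is not Riak's actual claim code; the ring size, N value, and ownership list below are made up purely for illustration) of how the N=2 consecutive vnodes in a preference list can both belong to one physical node:

# A toy model of a Riak ring: RING_SIZE partitions (vnodes), each owned by
# one physical node. A key's preference list is the N_VAL consecutive
# partitions starting at the partition the key hashes to.
RING_SIZE = 64
N_VAL = 2

# Hypothetical ownership: mostly alternating between the two machines, but
# one extra partition ends up on 192.168.0.21, so two adjacent vnodes share
# an owner -- the kind of imbalance described above.
owners = ['riak@192.168.0.21' if i % 2 == 0 else 'riak@192.168.0.51'
          for i in range(RING_SIZE)]
owners[15] = 'riak@192.168.0.21'

def preflist(start):
    """Owners of the N_VAL consecutive vnodes starting at partition 'start'."""
    return [owners[(start + i) % RING_SIZE] for i in range(N_VAL)]

for start in range(RING_SIZE):
    distinct = set(preflist(start))
    if len(distinct) < N_VAL:
        print("keys hashing to partition %d keep both replicas on %s"
              % (start, distinct.pop()))

Every key whose preference list collapses onto one machine like that loses both of its copies the moment that machine is stopped.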

If you are running fewer nodes than your N value (or exactly that many), all bets are off when it comes to testing Riak functionality. We suggest a physical node count > N + 1 to see Riak's true capabilities. If you just want to test functionality on a small scale, you can use the 'make devrel' setup to run 4 nodes on a single machine [1]. Please do not test speed or throughput in that setup, but if you are trying to see how 2i works, it's a better option than two nodes with N=2.
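
Roughly, the devrel workflow looks like the following (the linked page [1] has the authoritative steps, and the exact join syntax can differ between Riak versions):

make devrel
dev/dev1/bin/riak start
dev/dev2/bin/riak start
dev/dev3/bin/riak start
dev/dev4/bin/riak start
dev/dev2/bin/riak-admin join dev1@127.0.0.1
dev/dev3/bin/riak-admin join dev1@127.0.0.1
dev/dev4/bin/riak-admin join dev1@127.0.0.1
dev/dev1/bin/riak-admin member_status

That gives you four independent Riak nodes on one box, which is enough to exercise 2i and node-failure scenarios realistically (just not performance, as noted above).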


[1] http://wiki.basho.com/Building-a-Development-Environment.html

Hope that helps,
Jared






On Mar 30, 2012, at 11:49 AM, Cedric Maion wrote:

> Hey there,
> 
> I'm playing with a two-node Riak 1.1.1 cluster, but can't figure out how
> to keep Riak happy when only one node is up (and the other one has failed).
> 
> My use case is the following:
> - HTML pages get stored in the Riak cluster (it's used as a page cache)
> - I'm using secondary indexes to tag those documents
> - When I need to invalidate some HTML pages, I query those secondary
> indexes to retrieve the list of keys that need to be deleted (documents
> matching a specific tag value)
> 
> The bucket is configured with
> {"props":{"n_val":2,"allow_mult":false,"r":1,"w":1,"dw":0,"rw":1}}.
> 
> Everything works fine when both nodes are up and running.
> However, if I turn one node off, secondary index queries return
> {error,{error,insufficient_vnodes_available}} errors.
> 
> I can't find a way to have the cluster converge to a stable state on only one node.
> 
> 
> (192.168.0.21 is the node that has been turned off, by just stopping riak with /etc/init.d/riak stop)
> 
> 
> root@192.168.0.51:~# riak-admin member_status
> Attempting to restart script through sudo -u riak
> ================================= Membership==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> valid      50.0%      --      'riak@192.168.0.21'
> valid      50.0%      --      'riak@192.168.0.51'
> -------------------------------------------------------------------------------
> Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
> 
> 
> => 192.168.0.21 is not shown as down; I'm not sure whether that's an issue or not
> 
> 
> 
> root@192.168.0.51:~# riak-admin transfers
> Attempting to restart script through sudo -u riak
> Nodes ['riak@192.168.0.21'] are currently down.
> 'riak@192.168.0.51' waiting to handoff 5 partitions
> 
> 
> => by creating/reading many keys, I finally get "waiting to handoff 32 partitions", which seems OK to me (ring size is 64, so each node should normally own 32).
> => however, secondary index queries always fail until I turn the failed node back on.
> 
> 
> I tried to force the node down with "riak-admin down riak@192.168.0.21" from the valid node, but no luck either.
> 
> Not being able to use secondary indexes while a node is down is a real problem.
> Is this expected behavior, or what am I missing?
> 
> 
> Thanks in advance!
> Kind regards,
> 
>     Cedric
