Two nodes cluster, 2i queries impossible when one node down?

Cedric Maion cedric at maion.com
Fri Mar 30 13:49:38 EDT 2012


Hey there,

I'm playing with a two-node Riak 1.1.1 cluster, but can't figure out how to
keep Riak working with only one node up (and the other one failed).

My use case is the following:
- HTML pages get stored in the Riak cluster (it's used as a page cache)
- I'm using secondary indexes to tag those documents
- When I need to invalidate some HTML pages, I query those
secondary indexes to retrieve the list of keys that need to be deleted
(documents matching a specific tag value)
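For reference, the workflow above can be sketched roughly as follows against Riak's HTTP interface. This is only an illustration under assumptions: the bucket name "pages", the index name "tag_bin", the key and the base URL are hypothetical, and the functions only build request descriptions and parse a response body, they do not talk to a cluster.

```python
import json
from urllib.parse import quote

# Assumptions for illustration only: Riak's default HTTP port and made-up names.
BASE_URL = "http://127.0.0.1:8098"
BUCKET = "pages"      # hypothetical bucket name
INDEX = "tag_bin"     # hypothetical binary secondary index


def key_url(bucket, key):
    """URL of a single object."""
    return f"{BASE_URL}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def store_request(bucket, key, html, tag):
    """Describe the PUT that stores a page and tags it through an index header."""
    headers = {
        "Content-Type": "text/html",
        f"x-riak-index-{INDEX}": tag,
    }
    return ("PUT", key_url(bucket, key), headers, html)


def index_query_url(bucket, index, value):
    """URL of an exact-match secondary index query."""
    return (f"{BASE_URL}/buckets/{quote(bucket, safe='')}"
            f"/index/{quote(index, safe='')}/{quote(value, safe='')}")


def keys_from_index_response(body):
    """Extract the key list from the JSON body of an index query."""
    return json.loads(body)["keys"]


def invalidation_requests(bucket, keys):
    """Describe one DELETE per key returned by the index query."""
    return [("DELETE", key_url(bucket, k)) for k in keys]


if __name__ == "__main__":
    print(store_request(BUCKET, "home.html", "<html></html>", "homepage"))
    print(index_query_url(BUCKET, INDEX, "homepage"))
    sample_body = '{"keys": ["home.html", "news.html"]}'  # made-up response
    for req in invalidation_requests(BUCKET, keys_from_index_response(sample_body)):
        print(req)
```

The index query in the middle is the step that fails when one node is stopped; the store and delete requests are ordinary key operations.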

The bucket is configured with
{"props":{"n_val":2,"allow_mult":false,"r":1,"w":1,"dw":0,"rw":1}}.
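As a quick sanity check of those properties, here is a small sketch that parses the JSON and verifies that no quorum value exceeds n_val. The helper name is made up for illustration. It only shows that the settings are internally consistent (with r=1, a plain key read needs just one replica to answer); it says nothing about how index queries pick their vnodes.

```python
import json

# Bucket properties exactly as configured on the bucket.
PROPS_JSON = '{"props":{"n_val":2,"allow_mult":false,"r":1,"w":1,"dw":0,"rw":1}}'


def quorums_exceeding_n_val(props):
    """Return the names of quorum settings that are larger than n_val.

    Hypothetical helper: only integer quorum values are checked.
    """
    n_val = props["n_val"]
    return [name for name in ("r", "w", "dw", "rw")
            if isinstance(props.get(name), int) and props[name] > n_val]


if __name__ == "__main__":
    props = json.loads(PROPS_JSON)["props"]
    print("n_val:", props["n_val"])
    print("quorums larger than n_val:", quorums_exceeding_n_val(props))
```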

Everything works fine when both nodes are up and running.
However, if I turn one node off, secondary index queries return
{error,{error,insufficient_vnodes_available}} errors.

I can't find a way to have the cluster converge to a stable state with only one node running.


(192.168.0.21 is the node that has been turned off, by just stopping riak with /etc/init.d/riak stop)


root@192.168.0.51:~# riak-admin member_status
Attempting to restart script through sudo -u riak
================================= Membership==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      50.0%      --      'riak@192.168.0.21'
valid      50.0%      --      'riak@192.168.0.51'
-------------------------------------------------------------------------------
Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0


=> 192.168.0.21 is not reported as down; I'm not sure whether that's an issue



root@192.168.0.51:~# riak-admin transfers
Attempting to restart script through sudo -u riak
Nodes ['riak@192.168.0.21'] are currently down.
'riak@192.168.0.51' waiting to handoff 5 partitions


=> After creating/reading many keys, I eventually get "waiting to handoff 32 partitions", which seems OK to me (the ring size is 64, so each node should normally own 32 partitions).
=> However, secondary index queries always fail until I turn the failed node back on.
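The partition arithmetic behind that "32" can be written out as a toy model. This is a sketch under a loud assumption: partitions are handed out round-robin between the two nodes, which may not match what Riak's claim algorithm actually does. It only counts how many partitions the stopped node owns, i.e. how many the surviving node might end up holding on its behalf.

```python
RING_SIZE = 64
NODES = ["riak@192.168.0.21", "riak@192.168.0.51"]
DOWN_NODE = "riak@192.168.0.21"


def round_robin_owners(ring_size, nodes):
    """Assign partition i to nodes[i % len(nodes)] (an assumed, simplified layout)."""
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def partitions_owned(owners, node):
    """Count how many partitions in the ring belong to the given node."""
    return sum(1 for owner in owners if owner == node)


if __name__ == "__main__":
    owners = round_robin_owners(RING_SIZE, NODES)
    for node in NODES:
        print(node, "owns", partitions_owned(owners, node), "partitions")
    print("partitions needing a stand-in while", DOWN_NODE, "is down:",
          partitions_owned(owners, DOWN_NODE))
```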


I tried forcing it with "riak-admin down riak@192.168.0.21" from the surviving node, but no luck either.

Not being able to use secondary indexes while a node is down is a real problem.
Is this expected behavior, or what am I missing?


Thanks in advance!
Kind regards,

    Cedric

