full cluster failure

Gil Teixeira gil at pixelbeam.org
Tue Jun 12 04:28:21 EDT 2012


Hello, I am fairly new to Riak and am experiencing a very peculiar problem.
We are conducting a Riak evaluation to see if it fits our purposes, because on paper it seems to be a perfect fit.

To create an evaluation scenario, I have set up 5 nodes under VMware (the network is bridged).

The cluster runs just fine, and I can take nodes down for maintenance while the cluster keeps serving data as expected. Node reboots are also tolerated well by the cluster.

But if a node fails unexpectedly (hard power-off or sudden network disconnection), all nodes get exited and the full cluster becomes inaccessible until the failed node is back up or until I manually mark the failed node as down with riak-admin.

I was expecting a cluster (n=3) with 5 nodes to simply tolerate 1 node failure transparently.
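
To spell out the arithmetic behind that expectation, here is a minimal sketch. It assumes reads and writes use a simple majority quorum of the n replicas, and that a single failed node holds at most one of a key's three replicas. The function names are purely illustrative and are not part of Riak.

```python
def quorum(n_val: int) -> int:
    """Smallest majority of n_val replicas (assumed majority quorum)."""
    return n_val // 2 + 1


def survives(n_val: int, replicas_lost: int) -> bool:
    """True if the replicas still reachable can satisfy a majority quorum."""
    return n_val - replicas_lost >= quorum(n_val)


if __name__ == "__main__":
    # n=3 and one replica unreachable: 2 remain, and the quorum is 2.
    print("quorum for n=3:", quorum(3))
    print("survives 1 lost replica:", survives(3, 1))
    print("survives 2 lost replicas:", survives(3, 2))
```

Under these assumptions, losing one replica out of three still leaves a majority, which is why I expected requests to keep succeeding.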

Is there something I may be doing wrong? Is this the expected behavior?

I would appreciate any light anyone could shine on this subject.

Thank you,
