Whole cluster times out if one node is gone

David Smith dizzyd at basho.com
Mon Nov 29 07:44:25 EST 2010


On Tue, Nov 23, 2010 at 3:33 PM, Jay Adkisson <j4yferd at gmail.com> wrote:
> (many profuse apologies to Dan - hit "reply" instead of "reply all")
> Alrighty, I've done a little more digging.  When I throttle the writes
> heavily (2/sec) and set R and W to 1 all around, the cluster works just fine
> after I restart the node for about 15-20 seconds.  Then the read request
> hangs for about a minute, until node D disappears from connected_nodes in
> riak-admin status, at which point it returns the desired value (although
> sometimes I get a 503):

Are you seeing any error messages in log/erlang.log.* or log/sasl-error.log?

Can you expound on your use case a little -- are you doing a large
insert, or just a random read/write mix? Did you pre-populate the
dataset? Why are you using r=1, instead of relying on quorum for
reads?
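
For context on the quorum question, here is a minimal sketch of the
arithmetic. It assumes a replica count (n_val) of 3, which may not match
your bucket settings, and the function names are purely illustrative;
they are not part of any Riak client API. It only checks whether read
and write replica sets are guaranteed to overlap, and says nothing about
what happens while a node is down.

```python
def quorum(n_val):
    """Majority of n_val replicas: floor(n_val / 2) + 1."""
    return n_val // 2 + 1


def read_write_sets_overlap(n_val, r, w):
    """True when any r replicas must share at least one member with any w replicas."""
    return r + w > n_val


n_val = 3  # assumed replica count; check your bucket properties
q = quorum(n_val)

print("quorum for n_val=3:", q)
print("r=1, w=1 overlap guaranteed:", read_write_sets_overlap(n_val, 1, 1))
print("r=w=quorum overlap guaranteed:", read_write_sets_overlap(n_val, q, q))
```

With r=1 and w=1 a read can be answered by a replica that never saw the
latest write, which is why the r=1 choice matters here.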

How are you running the riak-admin status to measure the 15-20 seconds?

Thanks.

D.



