Riak replication and quorum

Peter Fales Peter.Fales at alcatel-lucent.com
Fri May 13 14:05:47 EDT 2011


Sean,

Thanks to you and Ben for clarifying how that works.  Since that was 
so helpful, I'll ask a follow-up question, and also a question on 
a mostly unrelated topic...

1) When I've removed a couple of nodes and the remaining nodes pick up 
the slack, is there any way for me to look under the hood and see that?
I'm using wget to fetch the '.../stats' URL from one of the remaining 
live nodes, and under ring_ownership it still lists the original 4
nodes, each one owning 1/4 of the total partitions.  That's part of
the reason why I didn't think the data ownership had been moved.
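
For illustration, the check described in (1) could be sketched in Python roughly as follows. This is only a sketch under assumptions: the base URL is hypothetical, and the code assumes the ring_ownership value arrives as a string of {'node',count} pairs inside the JSON stats document. If a node returns a different shape, the pattern needs adjusting.

```python
import json
import re
from urllib.request import urlopen

# Assumed shape of the ring_ownership value (not confirmed in this thread):
#   "[{'riak@10.0.0.1',16},{'riak@10.0.0.2',16}]"
PAIR = re.compile(r"\{'?([^',}]+)'?,\s*(\d+)\}")


def parse_ring_ownership(raw):
    """Return {node_name: partition_count} parsed from the ring_ownership string."""
    return {node: int(count) for node, count in PAIR.findall(raw)}


def ownership_fractions(raw):
    """Return {node_name: fraction_of_all_partitions}; empty dict if nothing parsed."""
    owned = parse_ring_ownership(raw)
    total = sum(owned.values())
    if total == 0:
        return {}
    return {node: count / total for node, count in owned.items()}


def fetch_ring_ownership(base_url):
    """Fetch <base_url>/stats and summarize ring_ownership.

    base_url is a hypothetical placeholder, e.g. "http://127.0.0.1:8098".
    """
    with urlopen(base_url + "/stats", timeout=5) as resp:
        stats = json.load(resp)
    return ownership_fractions(stats.get("ring_ownership", ""))
```

Running fetch_ring_ownership against each live node before and after the reboot gives a quick way to compare what each node reports.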

2) My test involves sending a large number of read/write requests to the 
cluster from multiple client connections and timing how long each request
takes.   I find that the vast majority of the requests are processed 
quickly (a few milliseconds to tens of milliseconds).  However, every once
in a while, the server seems to "hang" for a while.  When that happens
the response can take several hundred milliseconds or even several 
seconds.   Is this something that is known and/or expected?   There 
doesn't seem to be any pattern to how often it happens -- typically 
I'll see it a "few" times during a 10-minute test run.   Sometimes
it will go for several minutes without a problem.   I haven't ruled
out a problem with my test client, but it's a fairly simple-minded C++
program using the protocol buffers interface, so I don't think there
is too much that can go wrong on that end.
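
The measurement described in (2) can be illustrated with a rough Python sketch (the actual client is C++, so this only shows the timing logic). The do_request callable and the 100 ms threshold are hypothetical placeholders, not taken from the real test.

```python
import time


def time_requests(do_request, count):
    """Call do_request() count times and return each latency in milliseconds."""
    latencies = []
    for _ in range(count):
        start = time.monotonic()  # monotonic clock is unaffected by wall-clock changes
        do_request()
        latencies.append((time.monotonic() - start) * 1000.0)
    return latencies


def summarize(latencies_ms, slow_ms=100.0):
    """Report the middle element, the maximum, and which requests exceeded slow_ms."""
    ordered = sorted(latencies_ms)
    slow = [i for i, ms in enumerate(latencies_ms) if ms > slow_ms]
    return {
        "median_ms": ordered[len(ordered) // 2],  # upper-middle element for even counts
        "max_ms": ordered[-1],
        "slow_indices": slow,
    }
```

Keeping the indices (or timestamps) of the slow requests makes it possible to line them up against server-side logs, which may help separate client stalls from server stalls.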

Thanks again for your help!


On Fri, May 13, 2011 at 12:06:06PM -0400, Sean Cribbs wrote:
> Peter,
> 
> You've hit on a major feature of Riak: to be available in the face of network and hardware failure.  
> 
> When a node is down, other nodes (ones that do not "own" the replicas for a given key) will pick up the slack and serve read and write requests on behalf of the downed node.  This means that while the node(s) is down, you could write a key to the cluster and read it back while still satisfying quorum.  The standard quorum considers fallback nodes to be as valid as non-fallbacks (we're also in the process of implementing a way for you to be more strict about that, if you so desire).  When the downed nodes return, writes that were sent to fallbacks are returned to their proper owners via hinted handoff.
> 
> This feature lets your application that uses Riak stay available (even if in a degraded state), despite multiple failures. We consider this A Good Thing.
> 
> Sean Cribbs <sean at basho.com>
> Developer Advocate
> Basho Technologies, Inc.
> http://basho.com/
> 
> On May 13, 2011, at 11:13 AM, Peter Fales wrote:
> 
> > 
> > I'm a Riak newbie, trying to get some familiarity with the system by
> > running some tests on Amazon EC2.   I'm seeing some behavior that I don't 
> > understand...
> > 
> > I've set up a test where I create a 4-node cluster using 4 EC2 machines.
> > I've created a bucket with n_val=4, r=quorum, and w=quorum.   For
> > n_val=4, the quorum should be 3, so I thought I would have to have at
> > least 3 nodes in service for my read and write operations to succeed.
> > During my test, I start sending read/write requests to two of the nodes
> > (and I see the CPU load go up on all four nodes, so I know they are
> > talking to each other).  Then I reboot the other two nodes.  At that 
> > point, I was expecting the reads and writes to start failing, but in 
> > fact I usually don't see any problems.  (sometimes the query that is 
> > in progress at the time may fail or timeout, but if I establish a new
> > connection to the server, and start sending read/write requests again,
> > those requests will go through, even with only two of the 4 nodes in service)
> > 
> > I suspect I'm just missing something obvious, but I don't understand how
> > I can run with just two nodes.  What am I missing?
> > 
> > -- 
> > Peter Fales
> > Alcatel-Lucent
> > Member of Technical Staff
> > 1960 Lucent Lane
> > Room: 9H-505
> > Naperville, IL 60566-7033
> > Email: Peter.Fales at alcatel-lucent.com
> > Phone: 630 979 8031
> > 

-- 
Peter Fales
Alcatel-Lucent
Member of Technical Staff
1960 Lucent Lane
Room: 9H-505
Naperville, IL 60566-7033
Email: Peter.Fales at alcatel-lucent.com
Phone: 630 979 8031



