consistent hashing -- re-adding a node/replica

Brian Hammond brian at
Mon Nov 30 09:16:44 EST 2009

Thanks for the reply. That clears things up for me.

On Nov 30, 2009, at 9:03 AM, Justin Sheehy <justin at> wrote:

> Hi Brian,
> Sorry it took a while for you to see a response on this.
> On Tue, Oct 27, 2009 at 7:15 PM, Brian Hammond <brian at> wrote:
>> What happens if you take a server out of the hash ring and then re-add
>> that same server later?
> This very much depends on what you mean by "take a server out of the
> hash ring".  There are two things you could be thinking of:
> 1) You turn off or otherwise disconnect a server from the rest of the
> cluster, making it unreachable.
> 2) You explicitly do a remove_from_cluster, causing the server to give
> up its ownership of any ring partitions.
>> Won't the data it held before being removed now potentially be "stale",
>> as the keys it previously "owned" have been shuffled around to other
>> nodes/replicas?  Since the offline node is now back online, the same
>> keys that hashed to a different node in its absence will hash to the
>> original node again.  Thus, a read of such a key will potentially
>> produce stale data.  But that clearly cannot be the case.
> The confusion here comes from a slight mixup of the two alternatives I
> posed above.  It is intentional in the design of Riak that there is an
> essential difference between a machine becoming unreachable (#1) and a
> machine no longer being a logical part of the cluster (#2).  Case #2
> is the situation where owned key ranges will be moved to other nodes,
> and in that same case the node will not own anything when it comes
> back.  Rejoining the cluster after an explicit removal is no different
> than joining for the first time: the node will claim a fresh set of
> partitions whose data will then be moved to that node.
> On the other hand, a machine might simply become unreachable but be
> left logically in the cluster: choice #1.  In that case, no ownership
> will change, because the node is expected to rejoin at some point.
> It is for continued operation in this case that you usually want your
> R and W parameters to be less than N, so that requests can still
> succeed while one of the N replicas is unreachable.
> I hope that this helped with your understanding.
> -Justin
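
The two cases Justin distinguishes can be sketched with a toy model in
Python. This is only an illustration under stated assumptions, not Riak's
implementation: the ToyRing class, the ring size of 8, the round-robin
claim, and every method name except remove_from_cluster are hypothetical.
Disconnecting a node (case 1) leaves ownership untouched, while an explicit
removal (case 2) hands its partitions to other nodes, so a later join
claims a fresh set.

```python
import hashlib

RING_SIZE = 8  # number of partitions in the toy ring (hypothetical value)
N = 3          # replicas stored per key


class ToyRing:
    """Toy model of partition ownership; not Riak's actual data structures."""

    def __init__(self, nodes):
        # Deal partitions out round-robin among the initial nodes.
        self.owners = [nodes[i % len(nodes)] for i in range(RING_SIZE)]
        self.reachable = set(nodes)

    def partition_of(self, key):
        # Hash the key onto one of the ring's partitions.
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % RING_SIZE

    def preflist(self, partition):
        # Owners of the N consecutive partitions starting at `partition`.
        return [self.owners[(partition + i) % RING_SIZE] for i in range(N)]

    def disconnect(self, node):
        # Case 1: the node is unreachable; ownership is untouched.
        self.reachable.discard(node)

    def reconnect(self, node):
        # The node comes back and still owns what it owned before.
        self.reachable.add(node)

    def remove_from_cluster(self, node):
        # Case 2: the node gives up its partitions to the remaining members.
        remaining = sorted(set(self.owners) - {node})
        self.owners = [
            owner if owner != node else remaining[i % len(remaining)]
            for i, owner in enumerate(self.owners)
        ]
        self.reachable.discard(node)

    def join(self, node):
        # A joining node claims a fresh set of partitions (every k-th one).
        members = len(set(self.owners)) + 1
        for i in range(0, RING_SIZE, members):
            self.owners[i] = node
        self.reachable.add(node)

    def replicas_up(self, partition):
        # How many of the N replicas for this partition are reachable.
        return sum(1 for n in self.preflist(partition) if n in self.reachable)

    def can_serve(self, key, quorum):
        # A request with R or W equal to `quorum` needs that many live replicas.
        return self.replicas_up(self.partition_of(key)) >= quorum
```

In this toy, with four nodes and N = 3, every key still has two reachable
replicas while one node is disconnected, so requests with R = W = 2 keep
succeeding, whereas a request that needs all three replicas fails for any
key whose replica list includes the missing node.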

More information about the riak-users mailing list