consistent hashing -- re-adding a node/replica
brian at brianhammond.com
Mon Nov 30 09:16:44 EST 2009
Thanks for the reply. That clears things up for me.
On Nov 30, 2009, at 9:03 AM, Justin Sheehy <justin at basho.com> wrote:
> Hi Brian,
> Sorry it took a while for you to see a response on this.
> On Tue, Oct 27, 2009 at 7:15 PM, Brian Hammond
> <brian at brianhammond.com> wrote:
>> What happens if you take a server out of the hash ring and then re-add
>> that same server later?
> This very much depends on what you mean by "take a server out of the
> hash ring". There are two things you could be thinking of:
> 1) You turn off or otherwise disconnect a server from the rest of the
> cluster, making it unreachable.
> 2) You explicitly do a remove_from_cluster, causing the server to give
> up its ownership of any ring partitions.
>> Won't the data it held before being removed now potentially be "stale",
>> as the keys it previously "owned" have been shuffled around to other
>> nodes/replicas? Since the offline node is now back online, the same keys
>> that hashed to a different node in its absence will hash to the node
>> again. Thus, a read of such a key will potentially produce stale data.
>> But that clearly cannot be the case.
> The confusion here comes from a slight mixup of the two alternatives I
> posed above. It is intentional in the design of Riak that there is an
> essential difference between a machine becoming unreachable (#1) and a
> machine no longer being a logical part of the cluster (#2). Case #2
> is the situation where owned key ranges will be moved to other nodes,
> and in that same case the node will not own anything when it comes
> back. Rejoining the cluster after an explicit removal is no different
> than joining for the first time: the node will claim a fresh set of
> partitions whose data will then be moved to that node.
> On the other hand, a machine might simply become unreachable but be
> left logically in the cluster: choice #1. In that case, no ownership
> will change, since the node is expected to rejoin at some point.
> It is for continued operation in this case that you usually want your
> R and W parameters to be less than N.
> I hope that this helped with your understanding.
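To make the two cases above concrete, here is a toy Python sketch. It is only an illustration under stated assumptions, not Riak's implementation: ToyCluster, mark_unreachable, request_can_succeed and the round-robin handoff are all hypothetical names and choices, and only the name remove_from_cluster is borrowed from the reply. The quorum check counts only the N replicas assigned to a key and ignores any other mechanism the real system may use.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical model: owners[i] is the node that owns partition i."""
    owners: list
    unreachable: set = field(default_factory=set)

    def mark_unreachable(self, node):
        # Case 1: the node is down but still a logical member.
        # Ownership is deliberately left untouched.
        self.unreachable.add(node)

    def mark_reachable(self, node):
        # The node comes back and still owns the same partitions.
        self.unreachable.discard(node)

    def remove_from_cluster(self, node):
        # Case 2: the node gives up ownership of its partitions.
        # Assumption: they are handed to the remaining nodes round-robin.
        remaining = sorted(set(self.owners) - {node})
        if not remaining:
            raise ValueError("cannot remove the last node")
        self.owners = [
            remaining[i % len(remaining)] if owner == node else owner
            for i, owner in enumerate(self.owners)
        ]

    def owned_by(self, node):
        return [i for i, owner in enumerate(self.owners) if owner == node]


def request_can_succeed(replica_nodes, unreachable, quorum):
    """True if at least `quorum` of the key's N replicas are reachable."""
    reachable = [n for n in replica_nodes if n not in unreachable]
    return len(reachable) >= quorum


cluster = ToyCluster(owners=["a", "b", "c", "a", "b", "c"])

# Case 1: "c" is unreachable. It still owns its partitions, and a
# request with R (or W) = 2 out of N = 3 replicas can still proceed.
cluster.mark_unreachable("c")
still_owned = cluster.owned_by("c")
ok_with_2 = request_can_succeed(["a", "b", "c"], cluster.unreachable, 2)
ok_with_3 = request_can_succeed(["a", "b", "c"], cluster.unreachable, 3)

# Case 2: "c" is explicitly removed and ends up owning nothing.
cluster.mark_reachable("c")
cluster.remove_from_cluster("c")
owned_after_removal = cluster.owned_by("c")
```

In this model, the node in case 1 returns to exactly the partitions it had before, while the node in case 2 owns nothing and would have to claim partitions again as if joining for the first time. Setting the quorum equal to N makes the request fail as soon as one replica is unreachable, which is the reason given above for keeping R and W below N.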