consistent hashing -- re-adding a node/replica

Justin Sheehy justin at basho.com
Mon Nov 30 09:03:56 EST 2009


Hi Brian,

Sorry it took a while for you to see a response on this.

On Tue, Oct 27, 2009 at 7:15 PM, Brian Hammond <brian at brianhammond.com> wrote:

> What happens if you take a server out of the hash ring and then re-add that
> same server later?

This very much depends on what you mean by "take a server out of the
hash ring".  There are two things you could be thinking of:

1) You turn off or otherwise disconnect a server from the rest of the
cluster, making it unreachable.

2) You explicitly do a remove_from_cluster, causing the server to give
up its ownership of any ring partitions.

> Won't the data it held before being removed now potentially be "stale" as
> the keys it previously "owned" have been shuffled around to other
> nodes/replicas?  Since the offline node is now back online, the same keys
> that hashed to a different node in its absence will hash to the original
> node again.  Thus, a read of such a key will potentially produce stale data.
>  But that clearly cannot be the case.

The confusion here comes from a slight mixup of the two alternatives I
posed above.  It is intentional in the design of Riak that there is an
essential difference between a machine becoming unreachable (#1) and a
machine no longer being a logical part of the cluster (#2).  Case #2
is the situation where owned key ranges will be moved to other nodes,
and in that same case the node will not own anything when it comes
back.  Rejoining the cluster after an explicit removal is no different
from joining for the first time: the node will claim a fresh set of
partitions whose data will then be moved to that node.
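
To make the distinction concrete, here is a minimal sketch in Python.
It is a toy model, not Riak's actual claim algorithm: the ToyRing class,
its method names, and the round-robin assignment are all assumptions
made for illustration (only remove_from_cluster is a name taken from
the discussion above).  It shows that marking a node unreachable leaves
ownership alone, while an explicit removal hands the node's partitions
to others and a later join claims a fresh set.

```python
class ToyRing:
    """Toy ring with a fixed number of partitions (hypothetical model)."""

    def __init__(self, nodes, num_partitions=8):
        self.num_partitions = num_partitions
        self.members = list(nodes)
        self.unreachable = set()
        # Assumption: initial ownership is handed out round-robin.
        self.owner = {p: self.members[p % len(self.members)]
                      for p in range(num_partitions)}

    def partitions_of(self, node):
        return {p for p, n in self.owner.items() if n == node}

    def mark_unreachable(self, node):
        # Case #1: the node is down but still a logical member.
        # Ownership is deliberately left untouched.
        self.unreachable.add(node)

    def mark_reachable(self, node):
        self.unreachable.discard(node)

    def remove_from_cluster(self, node):
        # Case #2: the node gives up everything it owns.
        self.members.remove(node)
        self.unreachable.discard(node)
        if not self.members:
            raise ValueError("cannot remove the last member")
        for i, p in enumerate(sorted(self.partitions_of(node))):
            self.owner[p] = self.members[i % len(self.members)]

    def join(self, node):
        # Joining (or rejoining after removal): claim a fresh share by
        # taking partitions from whichever member currently owns the most.
        self.members.append(node)
        target = self.num_partitions // len(self.members)
        while len(self.partitions_of(node)) < target:
            donor = max((m for m in self.members if m != node),
                        key=lambda m: len(self.partitions_of(m)))
            self.owner[min(self.partitions_of(donor))] = node
```

With three nodes "a", "b" and "c", calling mark_unreachable("c") leaves
partitions_of("c") unchanged, whereas remove_from_cluster("c") empties
it, and a later join("c") gives "c" a new set of partitions that need
not match the ones it held before.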

On the other hand, a machine might simply become unreachable but be
left logically in the cluster: case #1.  In that case, no ownership
changes, because the node is expected to rejoin at some point.  It is
to keep operating in this case that you usually want your R and W
parameters to be less than N.
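
As a rough illustration of why R and W below N matter here, the sketch
below checks whether a request can still gather enough replies when some
replicas are unreachable.  The function name quorum_met and the node
names are hypothetical, and the model is deliberately simplified: it
only counts reachable replicas and ignores anything else a real cluster
might do to cover for a missing node.

```python
def quorum_met(replicas, unreachable, quorum):
    """Return True if at least `quorum` of `replicas` can respond.

    replicas    -- the N nodes holding copies of a key (hypothetical names)
    unreachable -- set of nodes that are currently down or disconnected
    quorum      -- the R (for reads) or W (for writes) value of the request
    """
    if not 1 <= quorum <= len(replicas):
        raise ValueError("quorum must be between 1 and N")
    reachable = [node for node in replicas if node not in unreachable]
    return len(reachable) >= quorum


# N = 3 replicas, one of which has become unreachable (case #1).
replicas = ["node1", "node2", "node3"]
down = {"node3"}

ok_with_r2 = quorum_met(replicas, down, quorum=2)  # R = 2 < N
ok_with_r3 = quorum_met(replicas, down, quorum=3)  # R = 3 = N
```

With one of three replicas down, the request with a quorum of 2 can
still be satisfied, while the request that insists on all 3 cannot.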

I hope that this helped with your understanding.

-Justin
