Understanding Riak's rebalancing and handoff behaviour

Sven Riedel sven.riedel at scoreloop.com
Thu Nov 11 01:59:55 EST 2010


Hi,
thanks for the detailed reply. So you're suggesting that the partition allocation somehow got into an inconsistent state across nodes. I'll have to check the logs to see if anything similar to your dump pops up.

> So I compared the ring states manually using the console, and in the
> ring state on the removed node quite a few partitions were assigned to
> different nodes than what the other nodes thought.
> After I manually synced the ring on the leaving node with the rest of
> the cluster by doing this on the console:
> 
> {ok, Ring} = rpc:call('riak@otherNode', riak_core_ring_manager,
> get_my_ring, []).
> riak_core_ring_manager:set_my_ring(R).
> 

That ought to be 

riak_core_ring_manager:set_my_ring( Ring ).

right? Just verifying because my Erlang is rather rudimentary :)
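
Assuming so, the full sequence on the console of the stale node would presumably look something like this (just my reading of your snippet; 'riak@othernode' stands in for any healthy cluster member):

    %% fetch the ring from a healthy node and overwrite the local copy
    {ok, Ring} = rpc:call('riak@othernode', riak_core_ring_manager, get_my_ring, []).
    riak_core_ring_manager:set_my_ring(Ring).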

> Also riak-admin ringready will not recognize this problem, as far as I
> read the code, because only the ring states of the current ring members
> are compared. I haven't tried it, cause I am still on 0.12.0. 
> The same is apparently true for riak-admin transfers, which might tell
> you that there are no handoffs left, even if the removed node still has
> data.

I'm running 0.13.0, so if we're stumbling over the same cause it's still there.
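
Since ringready apparently won't flag it, I guess I'll compare the ring ownership by hand from the console on the leaving node, roughly like this (a sketch based on my limited Erlang, assuming riak_core_ring:all_owners/1 returns the {Partition, Owner} list, with 'riak@othernode' again a placeholder):

    {ok, MyRing} = riak_core_ring_manager:get_my_ring().
    {ok, TheirRing} = rpc:call('riak@othernode', riak_core_ring_manager, get_my_ring, []).
    %% any entries printed here are partitions the two nodes disagree about
    [X || X <- riak_core_ring:all_owners(MyRing),
          not lists:member(X, riak_core_ring:all_owners(TheirRing))].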

> 
> I discovered another problem while debugging this. If you restart (or it
> crashes) a node that you removed from the cluster which still has data,
> it won't start handing off its data afterwards. The reason is that the
> node watcher also does not get notified that the other nodes are up,
> and so all of them are considered down. This, too, can only be worked
> around manually via the Erlang console.

Why would that have to be worked around at all? My understanding is that, thanks to the data replication within the ring, a single node encountering a messy and fatal accident shouldn't destabilize the entire ring. The nodes that hold the replica data would simply take over until a replacement node gets added and the newly dead node is removed (ok, via the console).
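
That said, if I ever do have to nudge such a node by hand, I'd probably start by checking what it thinks is reachable, something like the following (again only a sketch; I'm assuming net_adm:ping/1 plus riak_core_node_watcher:nodes/1 is the right way to look at this):

    %% re-establish the connection to a healthy node, then see which
    %% nodes the node watcher considers up for the KV service
    net_adm:ping('riak@othernode').
    riak_core_node_watcher:nodes(riak_kv).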

So this still leaves me with some of my original questions open:
>> 
>> 1. What would normally trigger a rebalancing of the nodes? 
>> 2. Is there a way to manually trigger a rebalancing?
>> 3. Did I do anything wrong in the procedure described above that left Riak in its current odd state?

Regards,
Sven

------------------------------------------
Scoreloop AG, Brecherspitzstrasse 8, 81541 Munich, Germany, www.scoreloop.com
sven.riedel at scoreloop.com

Registered office: Munich, Court of registration: Amtsgericht München, HRB 174805
Management board: Dr. Marc Gumpinger (Chairman), Dominik Westner, Christian van der Leeden; Chairman of the supervisory board: Olaf Jacobi
