Riak 1.2: stalled handoff

Yuri Lukyanov snaky at aboutecho.com
Sat Apr 6 02:20:54 EDT 2013


Backend is leveldb.

Ring status:

# riak-admin ring-status
Attempting to restart script through sudo -H -u riak
================================== Claimant ===================================
Claimant:  riak at nsto0r1
Status:     up
Ring Ready: true

============================== Ownership Handoff ==============================
Owner:      riak at nsto0r5
Next Owner: riak at nsto0r6

Index: 0
  Waiting on: [riak_kv_vnode]
  Complete:   [riak_pipe_vnode]

-------------------------------------------------------------------------------

============================== Unreachable Nodes ==============================
All nodes are up and reachable

# riak-admin ringready
Attempting to restart script through sudo -H -u riak
TRUE All nodes agree on the ring [riak at nsto0r0,riak at nsto0r1,riak at nsto0r2,
                                  riak at nsto0r3,riak at nsto0r4,riak at nsto0r5,
                                  riak at nsto0r6]


Unfortunately, this is the current status; things have changed since I wrote
the previous email. This is what riak-admin transfers shows now:

# riak-admin transfers
Attempting to restart script through sudo -H -u riak
riak at nsto0r6 waiting to handoff 1 partitions
riak at nsto0r3 waiting to handoff 1 partitions
riak at nsto0r2 waiting to handoff 1 partitions
riak at nsto0r1 waiting to handoff 1 partitions

Active Transfers:

transfer type: ownership_handoff
vnode type: riak_kv_vnode
partition: 0
started: 2013-04-06 05:36:06 [1.93 min ago]
last update: 2013-04-06 06:08:11 [2.01 s ago]
objects transferred: 5184001

                       2692 Objs/s
    riak at nsto0r5 =======================>     riak at nsto0r6
                       931.18 KB/s

So it looks like the partition is finally being transferred. I don't
understand why new handoffs have appeared, though (maybe that is a
different story about riak at nsto0r1; it looks like it crashed for some reason).
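
As a rough cross-check of the numbers above, here is a minimal Python sketch
that parses the pasted riak-admin transfers text, counts the waiting
partitions, and derives an average rate from the two timestamps. The
summarize function, the regular expressions, and the assumption that the
output always keeps this layout are illustrative only; none of it is part of
Riak.

```python
import re
from datetime import datetime

# Sample text copied from the `riak-admin transfers` output in this email.
SAMPLE = """\
riak at nsto0r6 waiting to handoff 1 partitions
riak at nsto0r3 waiting to handoff 1 partitions
riak at nsto0r2 waiting to handoff 1 partitions
riak at nsto0r1 waiting to handoff 1 partitions

Active Transfers:

transfer type: ownership_handoff
vnode type: riak_kv_vnode
partition: 0
started: 2013-04-06 05:36:06 [1.93 min ago]
last update: 2013-04-06 06:08:11 [2.01 s ago]
objects transferred: 5184001
"""

# Hypothetical patterns, written against the layout shown above.
WAITING = re.compile(r"^(\S.*?) waiting to handoff (\d+) partitions$", re.M)
STAMP = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"


def summarize(text):
    """Return (waiting partitions per node, average objects/s or None)."""
    waiting = {node: int(count) for node, count in WAITING.findall(text)}
    started = re.search(r"^started: " + STAMP, text, re.M)
    updated = re.search(r"^last update: " + STAMP, text, re.M)
    objects = re.search(r"^objects transferred: (\d+)", text, re.M)
    rate = None
    if started and updated and objects:
        fmt = "%Y-%m-%d %H:%M:%S"
        elapsed = (datetime.strptime(updated.group(1), fmt)
                   - datetime.strptime(started.group(1), fmt)).total_seconds()
        if elapsed > 0:
            rate = int(objects.group(1)) / elapsed
    return waiting, rate


waiting, rate = summarize(SAMPLE)
print(sum(waiting.values()), "partitions waiting")
if rate is not None:
    print(f"{rate:.0f} objects/s on average")
```

On this output it counts 4 waiting partitions and an average of roughly 2693
objects/s over the 1925 s between the two timestamps. That is in line with
the reported 2692 Objs/s and suggests the transfer had been running for about
32 minutes, rather than the 1.93 minutes shown in brackets.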

Hm, is there a known reason why a Riak cluster might stop handoffs for a period
of time?

Thank you.


On Sat, Apr 6, 2013 at 2:42 AM, Alexander Moore <moore.alex at gmail.com> wrote:

> Also, what's the output of riak-admin ringready ?
>
> Thanks,
> Alex
>
>
> --Alex
>
>
> On Fri, Apr 5, 2013 at 6:37 PM, Alexander Moore <moore.alex at gmail.com> wrote:
>
>> Hi Yuri,
>>
>> What backend are you using?
>>
>> What does riak-admin ring-status output?
>>
>> Thanks,
>> Alex
>>
>>
>>
>>
>> On Fri, Apr 5, 2013 at 6:30 PM, Yuri Lukyanov <snaky at aboutecho.com> wrote:
>>
>>> Hi,
>>>
>>> I was adding a 7th node to one of our Riak 1.2 clusters. Everything
>>> was OK until the process suddenly stopped with one handoff left:
>>>
>>> # riak-admin transfers
>>> Attempting to restart script through sudo -H -u riak
>>> riak at nsto0r6 waiting to handoff 1 partitions
>>>
>>> Active Transfers:
>>>
>>>
>>> Note that no transfers are displayed. At first I thought it was just
>>> a temporary pause, but it has already been about 12 hours since then.
>>>
>>> Here is what member-status shows:
>>>
>>> # riak-admin member-status
>>> Attempting to restart script through sudo -H -u riak
>>> ================================= Membership ==================================
>>> Status     Ring    Pending    Node
>>> -------------------------------------------------------------------------------
>>> valid      14.1%     14.1%    riak at nsto0r0
>>> valid      14.1%     14.1%    riak at nsto0r1
>>> valid      14.1%     14.1%    riak at nsto0r2
>>> valid      14.1%     14.1%    riak at nsto0r3
>>> valid      15.6%     15.6%    riak at nsto0r4
>>> valid      *15.6%     14.1%*    riak at nsto0r5
>>> valid      *12.5%     14.1%*    riak at nsto0r6
>>> -------------------------------------------------------------------------------
>>> Valid:7 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>>>
>>>
>>> What could be the reason for this behaviour, and how can I investigate it
>>> further?
>>> How can I force the handoff?
>>>
>>> Thanks.