Missing primary partitions

Mark Phillips mark at basho.com
Wed Oct 23 14:58:12 EDT 2013


Hi Sergey,

It looks like the initial TCP connection is timing out when riak at de5
and riak at de6 first try to reach the handoff IP/port for riak at de2
(which is configured in riak at de2's app.config).
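
One quick way to confirm this from de5 or de6 is to try opening a TCP
connection to the handoff port yourself. Below is a minimal sketch, not
an official tool. It assumes the handoff port is 8099 (the usual
handoff_port value in the riak_core section of app.config; check your
own config), and the hostname "de2" in the usage comment is a
placeholder.

```python
import socket

# Assumption: 8099 is the handoff_port set in riak_core in app.config.
HANDOFF_PORT = 8099


def can_connect(host, port=HANDOFF_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Usage (run on de5 or de6; "de2" is a placeholder hostname):
#   can_connect("de2")
```

If this returns False from de5 and de6 but True from the other nodes,
the problem is more likely a firewall or routing issue than Riak itself.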

You may have already gotten to the bottom of why that's happening, but
the first thing to try is setting the cluster-wide handoff_concurrency
limit to "0" and then back to the default. That interrupts and restarts
any in-progress transfers, so you can then watch the network traffic on
the handoff ports. You can do this with "riak-admin transfer-limit" [1].
If you don't specify a "node" (as shown in the docs), the limit is set
for the whole cluster, which is what you want here.
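
The reset sequence described above could be scripted roughly as
follows. This is only a sketch under a few assumptions: that riak-admin
is on the PATH of the node where it runs, that the default
handoff_concurrency is 2 (check your app.config or the docs for your
version), and that a 10 second pause is enough; the function name and
the pause are made up for illustration.

```python
import subprocess
import time

# Assumption: 2 is the default handoff_concurrency for this Riak version.
DEFAULT_LIMIT = 2


def restart_transfers(default_limit=DEFAULT_LIMIT, pause=10.0,
                      run=subprocess.check_call):
    """Drop the cluster-wide transfer limit to 0, restore it, then list transfers."""
    # No node argument, so the limit applies to the whole cluster.
    run(["riak-admin", "transfer-limit", "0"])
    # Arbitrary pause to let in-progress transfers stop.
    time.sleep(pause)
    run(["riak-admin", "transfer-limit", str(default_limit)])
    # Show the transfers that are active after the restart.
    run(["riak-admin", "transfers"])


# Usage (on any cluster node):
#   restart_transfers()
```

The `run` parameter exists only so the command sequence can be
dry-run (for example with `run=print`) before touching a live cluster.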

Hope that helps. Keep us posted.

Mark

[1] http://docs.basho.com/riak/latest/ops/running/tools/riak-admin/#transfer-limit




On Tue, Oct 22, 2013 at 2:27 AM,  <fenix.serega at gmail.com> wrote:
> Hi all
>
> What should I do when primary partitions are missing?
>
> 6 node cluster, leveldb, 1.3.2
>
> Nodes 5 and 6 are always waiting to hand off 46 partitions:
>
> 'riak at de6' waiting to handoff 46 partitions
> 'riak at de5' waiting to handoff 46 partitions
>
> Active Transfers:
>
> transfer type: hinted_handoff
> vnode type: riak_kv_vnode
> partition: 919147514102638163401536164325474867830488825856
> started: 2013-10-22 08:18:35 [-131940361.00 us ago]
> last update: no updates seen
> objects transferred: unknown
>
>                          unknown
> riak at de5 =======================> riak at de2
>                          unknown
>
> transfer type: hinted_handoff
> vnode type: riak_kv_vnode
> partition: 667951920186389224335277833702363723827125420032
> started: 2013-10-22 08:18:40 [-136954679.00 us ago]
> last update: no updates seen
> objects transferred: unknown
>
>                          unknown
> riak at de6 =======================> riak at de2
>                          unknown
>
> .......
>
>
> How can I fix or disable these handoffs and errors?
>
> 2013-10-21 23:59:56.894 [error]
> <0.9317.693>@riak_core_handoff_sender:start_fold:226 hinted_handoff transfer
> of riak_kv_vnode from 'riak at de5'
> 987655403352524237692333890859050634376860663808 to 'riak at de2'
> 987655403352524237692333890859050634376860663808 failed because of
> error:{badmatch,{error,timeout}}
> [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,101}]}]
> ....
>
> Thanks,
> Sergey
>