Simultaneous handoff and merge

Yuri Lukyanov snaky at aboutecho.com
Thu Apr 18 05:07:52 EDT 2013


Hi,

I have a cluster of 17 Riak (1.2.1) nodes with Bitcask as the backend.

Recently one of the nodes was down for a while. After the node was
started again, the cluster began doing handoffs as expected. But then a
merge process also began on the same node. I know this from log messages
like this:

2013-04-18 08:14:09.061 [info] <0.22952.79> Merged
["/var/lib/riak/bitcask/496682197061674038608283517368424307461195825152"


And then something went wrong (the logs on the same node):


2013-04-18 08:39:22.217 [error] <0.31842.70> Supervisor riak_core_vnode_sup
had child undefined started with {riak_core_vnode,start_link,undefined} at
<0.4000.80> exit with reason
{timeout,{gen_server,call,[riak_core_handoff_manager,{add_outbound,riak_kv_vnode,208378163135070142634509751539626289911881007104,riak@nsto2r5,<0.4000.80>}]}}
in context child_terminated

2013-04-18 08:42:46.067 [error] <0.5154.80> gen_server <0.5154.80>
terminated with reason:
{timeout,{gen_server,call,[riak_core_handoff_manager,{add_inbound,[]}]}}
2013-04-18 08:42:52.790 [error] <0.5154.80> CRASH REPORT Process
riak_core_handoff_listener with 1 neighbours exited with reason:
{timeout,{gen_server,call,[riak_core_handoff_manager,{add_inbound,[]}]}} in
gen_server:terminate/6 line 747
2013-04-18 08:42:53.450 [error] <0.31847.70> Supervisor
riak_core_handoff_listener_sup had child riak_core_handoff_listener started
with riak_core_handoff_listener:start_link() at <0.5154.80> exit with
reason
{timeout,{gen_server,call,[riak_core_handoff_manager,{add_inbound,[]}]}} in
context child_terminated


The node itself was also being reported as not running from time to time:

# riak-admin ring-status
Node is not running!

The beam process was still running though.

Maybe it's not related to handoffs and merges. That was just a guess.
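
One way to test that guess would be to pull the merge and handoff-timeout
entries out of the log and compare their timestamps. Below is a rough
Python sketch, assuming the log layout shown above (timestamp, [level],
pid, message). The names parse_entries, classify and SAMPLE are made up
for illustration, and the matching rules are only a guess at what counts
as a relevant entry.

```python
import re
from datetime import datetime

# Assumed layout: "YYYY-MM-DD HH:MM:SS.mmm [level] rest-of-message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(\w+)\] (.*)$"
)

def parse_entries(lines):
    """Group raw lines into [timestamp, level, text] entries.

    Lines that do not start with a timestamp are treated as
    continuations of the previous entry.
    """
    entries = []
    for raw in lines:
        line = raw.rstrip()
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            entries.append([ts, m.group(2), m.group(3)])
        elif entries and line:
            entries[-1][2] += " " + line.strip()
    return entries

def classify(entries):
    """Return (timestamp, kind) pairs for merges and handoff timeouts."""
    events = []
    for ts, level, text in entries:
        if level == "info" and "Merged" in text:
            events.append((ts, "merge"))
        elif (level == "error"
              and "riak_core_handoff_manager" in text
              and "timeout" in text):
            events.append((ts, "handoff_timeout"))
    return events

# Lines taken from the excerpts above, wrapped the same way.
SAMPLE = [
    "2013-04-18 08:14:09.061 [info] <0.22952.79> Merged",
    '["/var/lib/riak/bitcask/496682197061674038608283517368424307461195825152"',
    "2013-04-18 08:42:46.067 [error] <0.5154.80> gen_server <0.5154.80>",
    "terminated with reason:",
    "{timeout,{gen_server,call,[riak_core_handoff_manager,{add_inbound,[]}]}}",
]

if __name__ == "__main__":
    for ts, kind in classify(parse_entries(SAMPLE)):
        print(ts.isoformat(), kind)
```

Run over the full log file instead of SAMPLE, this would show whether the
timeouts only start once merges are under way or also occur without them.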


Any information and advice on this would be greatly appreciated. It's still
happening right now, and I can gather more details if anyone wants them.

Thanks in advance.