TCP recv timeout and handoffs almost all the time

Simon Effenberg seffenberg at team.mobile.de
Thu Jul 18 14:21:57 EDT 2013


Hi @list,

I sometimes see log entries saying "hinted_handoff transfer of .. failed because of TCP recv timeout".
Also, riak-admin transfers shows me many handoffs. (Is there a way to see how many handoffs have happened through "riak-admin status"?)

- Is it normal to have up to 30 handoffs from/to different nodes?
- How can I get to the bottom of the TCP recv timeout? I'm not sure whether this is a network problem or whether the other node is too slow. The load on the machines is OK (some IOwait, but not 100%). Could AAE be interfering?

Here is the log output for the TCP recv timeout. The timeout itself does not occur that often, but handoffs happen very often:

2013-07-18 16:22:05.654 UTC [error] <0.28933.14>@riak_core_handoff_sender:start_fold:216 hinted_handoff transfer of riak_kv_vnode from 'riak at 10.46.109.207' 1118962191081472546749696200048404186924073353216 to 'riak at 10.46.109.205' 1118962191081472546749696200048404186924073353216 failed because of TCP recv timeout
2013-07-18 16:22:05.673 UTC [error] <0.202.0>@riak_core_handoff_manager:handle_info:282 An outbound handoff of partition riak_kv_vnode 1118962191081472546749696200048404186924073353216 was terminated for reason: {shutdown,timeout}
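
To put a number on the failures from the logs themselves, something along these lines could tally them per source/destination pair. It is only a rough sketch: the regular expression is modeled on the single riak_core_handoff_sender line quoted above and may not match other Riak versions, and the function name count_failed_handoffs is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the one sender error line quoted above (an assumption;
# other Riak versions may word or order the fields differently).
FAILED_TRANSFER = re.compile(
    r"(?P<kind>\w+) transfer of (?P<module>\w+) "
    r"from '(?P<src>[^']+)' (?P<partition>\d+) "
    r"to '(?P<dst>[^']+)' \d+ "
    r"failed because of (?P<reason>.+)$"
)


def count_failed_handoffs(lines):
    """Count failed transfers keyed by (source node, target node, reason)."""
    counts = Counter()
    for line in lines:
        match = FAILED_TRANSFER.search(line.rstrip())
        if match:
            counts[(match["src"], match["dst"], match["reason"])] += 1
    return counts


# Usage with the error line from this mail; for a real log file, pass the
# open file object instead of a list.
sample = [
    "2013-07-18 16:22:05.654 UTC [error] <0.28933.14>"
    "@riak_core_handoff_sender:start_fold:216 hinted_handoff transfer of "
    "riak_kv_vnode from 'riak at 10.46.109.207' "
    "1118962191081472546749696200048404186924073353216 to "
    "'riak at 10.46.109.205' "
    "1118962191081472546749696200048404186924073353216 "
    "failed because of TCP recv timeout",
]
for (src, dst, reason), n in count_failed_handoffs(sample).items():
    print(f"{src} -> {dst}: {n} x {reason}")
```

This only counts failed transfers that were logged; successful handoffs do not appear in error lines like these, so it does not answer how many handoffs happened in total.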


Thanks in advance
Simon



