Riak riak_kv_vnode worker pool crashed

Kelly McLaughlin kelly at basho.com
Fri Oct 12 12:29:47 EDT 2012


Mikhail,

I am familiar with this error, but I need to understand more about your situation before I can make any recommendations. Can you describe in more detail what you are doing when this happens? How often are you running secondary index queries? How many objects are in the bucket you're querying against? What's the size of your cluster (both nodes and vnodes)?

Kelly




On Oct 10, 2012, at 10:29 PM, Mikhail Kuznetsov <kuznetsov.m.yu at gmail.com> wrote:

> I deployed a test stand to demo a server application for clients: Riak 1.2.0 on Debian, compiled and run with Erlang R15B01. The client uses the Riak Erlang protobuf client 1.2. I have to use only one node, with an almost default config. For the storage backend I use eleveldb_backend (we use secondary indexes very often). After several days of mostly idle waiting (the demo application is shown once every day or two), the Riak server starts dropping connections (the Erlang PB client is not notified and reports that the connection is OK). I looked at the Riak log and found these entries:
> 
> 2012-09-20 00:10:10.976 [error] <0.803.0>@riak_core_vnode:handle_info:510 296867520082839655260123481645494988367611297792 riak_kv_vnode worker pool crashed {timeout,{gen_server,call,[<0.819.0>,{work,<0.806.0>,{fold,#Fun,#Fun},{raw,59205031,<0.28969.11>}}]}}
>
> 2012-09-20 00:10:10.976 [error] <0.862.0>@riak_core_vnode:handle_info:510 365375409332725729550921208179070754913983135744 riak_kv_vnode worker pool crashed {timeout,{gen_fsm,sync_send_event,[<0.866.0>,{checkout,false,5000},5000]}}
> 
> ...
> 
> 2012-09-20 00:17:14.234 [error] <0.1645.0> CRASH REPORT Process <0.1645.0> with 0 neighbours exited with reason: {timeout,{gen_fsm,sync_send_event,[<0.1646.0>,{checkout,false,5000},5000]}} in gen_fsm:terminate/7 line 611
>
> 2012-09-20 00:17:14.639 [error] <0.2337.0>@riak_api_pb_server:handle_info:171 Unrecognized message {pipe_log,#Ref<0.0.17.19772>,index,{trace,[error],{vnode,{fitting_died,1118962191081472546749696200048404186924073353216}}}}
>
> 2012-09-20 00:17:15.106 [error] <0.1596.0> gen_fsm <0.1596.0> in state ready terminated with reason: {timeout,{gen_fsm,sync_send_event,[<0.1597.0>,{checkout,false,5000},5000]}}
> 
> If the client makes a request, it gets {error, disconnected}. If we then reconnect, the connection succeeds and works.
> 
> What should I change in the config file, or tune in the client connection, to make this work reliably? I fully understand that a ring is good and that haproxy can check nodes (and maybe break the connection in this case), but I have only one Riak node on the demo server and haproxy is not installed. Can you advise anything?
