riak crash

Matthew Von-Maszewski matthewv at basho.com
Mon Feb 22 10:40:21 EST 2016


Raviraj,

Please run 'riak-debug'.  This is in the bin directory along with 'riak start' and 'riak-admin'.

riak-debug will produce a file named similar to /home/user/riak@10.0.0.15-riak-debug.tar.gz
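
A minimal shell sketch of that step, for reference. It assumes riak-debug is on the PATH (or is run from Riak's bin directory) and that the archive is written to the directory it is run from; the helper name latest_debug_archive is made up for illustration and is not part of Riak.

```shell
#!/bin/sh
# Hypothetical helper (not part of Riak): print the newest
# *-riak-debug.tar.gz in a directory, defaulting to the current one.
# Prints nothing if no such archive exists.
latest_debug_archive() {
    dir="${1:-.}"
    ls -t "$dir"/*-riak-debug.tar.gz 2>/dev/null | head -n 1
}

# Typical use on a Riak node (left commented out so the helper can be
# sourced without running anything):
#   riak-debug                           # assumed to write the archive here
#   archive=$(latest_debug_archive .)
#   tar -tzf "$archive" | head           # list a few entries to confirm it is readable
```

Listing the archive with tar before sending it is only a sanity check that the file is complete and readable.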

You should email that file to me directly, or post it to dropbox or similar and send me a link.  You do not want to send that file to the entire mailing list.

I will review the file and suggest next steps.

Matthew

> On Feb 22, 2016, at 5:13 AM, Raviraj Vaishampayan <rvaishampayan at vmware.com> wrote:
> 
> Hi,
> 
> We have been using Riak to gather our test data and analyze results after each test completes.
> Recently we have observed Riak crashes in the Riak console logs.
> This causes our tests to fail to record data to Riak and bail out :-(
> 
> The crash logs are as follows:
> 2016-02-19 16:25:26.255 [error] <0.2160.0> gen_fsm <0.2160.0> in state active terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195
> 2016-02-19 16:25:26.260 [error] <0.2160.0> CRASH REPORT Process <0.2160.0> with 2 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.260 [error] <0.172.0> Supervisor riak_core_vnode_sup had child undefined started with {riak_core_vnode,start_link,undefined} at <0.2160.0> exit with reason no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in context child_terminated
> 2016-02-19 16:25:26.261 [error] <0.4319.0> gen_fsm <0.4319.0> in state ready terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195
> 2016-02-19 16:25:26.275 [error] <0.4319.0> CRASH REPORT Process <0.4319.0> with 10 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.278 [error] <0.4320.0> Supervisor {<0.4320.0>,poolboy_sup} had child riak_core_vnode_worker started with riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[268322566228720457638957762256505085639956365312,...]},...]) at undefined exit with reason no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in context shutdown_error
> 2016-02-19 16:25:26.278 [error] <0.4320.0> gen_server <0.4320.0> terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195
> 2016-02-19 16:25:26.278 [error] <0.4320.0> CRASH REPORT Process <0.4320.0> with 0 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_server:terminate/6 line 744
> 2016-02-19 16:25:26.806 [error] <0.2157.0> gen_fsm <0.2157.0> in state active terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
> 2016-02-19 16:25:26.808 [error] <0.2157.0> CRASH REPORT Process <0.2157.0> with 2 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 600
> 2016-02-19 16:25:26.809 [error] <0.5450.0> gen_fsm <0.5450.0> in state ready terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
> 2016-02-19 16:25:26.809 [error] <0.172.0> Supervisor riak_core_vnode_sup had child undefined started with {riak_core_vnode,start_link,undefined} at <0.2157.0> exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in context child_terminated
> 2016-02-19 16:25:26.809 [error] <0.5450.0> CRASH REPORT Process <0.5450.0> with 10 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.809 [error] <0.5451.0> Supervisor {<0.5451.0>,poolboy_sup} had child riak_core_vnode_worker started with riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[211232658520482062396626323478525280184646500352,...]},...]) at undefined exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in context shutdown_error
> 2016-02-19 16:25:26.809 [error] <0.5451.0> gen_server <0.5451.0> terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
> 2016-02-19 16:25:26.809 [error] <0.5451.0> CRASH REPORT Process <0.5451.0> with 0 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_server:terminate/6 line 744
> 
> Our setup is as follows:
> We have a Riak cluster with 10 nodes; the configuration of each node is as follows:
> RAM: 48GB
> Disk:
>          80GB (/)
>          504GB (separate riak partition)
> Riak Version: 2.1.3-1 (2.1.3)
> Data in riak: After observing crash, total data in riak partition was ~50GB
> 
> Riak config is as follows:
> riak.conf
> [Attached with this email]
> 
> advanced.config:
> [
>  {riak_kv, [{add_paths, ["/usr/local/lib/scale_riak/ebin"]}]},
>  {webmachine, [{backlog, 511}, {nodelay, true}]},
>  {yokozuna, [{solr_request_timeout, 120000}]}
> ].
> 
> We have observed this a few times now. After each crash, latency increases and our application starts timing out.
> We would really like to understand what might be causing this crash. If it is due to missing config on our nodes, we would like to fix it.
> 
> Thanks for your help in advance :-)
> 
> Regards,
> Raviraj
> <riak.conf>
