riak crash

Raviraj Vaishampayan rvaishampayan at vmware.com
Mon Feb 22 05:13:02 EST 2016


Hi,

We have been using Riak to gather our test data and analyze results after each test completes.
Recently we have observed Riak crashes in the Riak console logs.
When this happens, our tests fail to record data to Riak and bail out :-(

The crash logs are as follows:
2016-02-19 16:25:26.255 [error] <0.2160.0> gen_fsm <0.2160.0> in state active terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195

2016-02-19 16:25:26.260 [error] <0.2160.0> CRASH REPORT Process <0.2160.0> with 2 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_fsm:terminate/7 line 622

2016-02-19 16:25:26.260 [error] <0.172.0> Supervisor riak_core_vnode_sup had child undefined started with {riak_core_vnode,start_link,undefined} at <0.2160.0> exit with reason no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in context child_terminated

2016-02-19 16:25:26.261 [error] <0.4319.0> gen_fsm <0.4319.0> in state ready terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195

2016-02-19 16:25:26.275 [error] <0.4319.0> CRASH REPORT Process <0.4319.0> with 10 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_fsm:terminate/7 line 622

2016-02-19 16:25:26.278 [error] <0.4320.0> Supervisor {<0.4320.0>,poolboy_sup} had child riak_core_vnode_worker started with riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[268322566228720457638957762256505085639956365312,...]},...]) at undefined exit with reason no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in context shutdown_error

2016-02-19 16:25:26.278 [error] <0.4320.0> gen_server <0.4320.0> terminated with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195

2016-02-19 16:25:26.278 [error] <0.4320.0> CRASH REPORT Process <0.4320.0> with 0 neighbours exited with reason: no function clause matching riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...}) line 1195 in gen_server:terminate/6 line 744

2016-02-19 16:25:26.806 [error] <0.2157.0> gen_fsm <0.2157.0> in state active terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}

2016-02-19 16:25:26.808 [error] <0.2157.0> CRASH REPORT Process <0.2157.0> with 2 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 600

2016-02-19 16:25:26.809 [error] <0.5450.0> gen_fsm <0.5450.0> in state ready terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}

2016-02-19 16:25:26.809 [error] <0.172.0> Supervisor riak_core_vnode_sup had child undefined started with {riak_core_vnode,start_link,undefined} at <0.2157.0> exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in context child_terminated

2016-02-19 16:25:26.809 [error] <0.5450.0> CRASH REPORT Process <0.5450.0> with 10 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 622

2016-02-19 16:25:26.809 [error] <0.5451.0> Supervisor {<0.5451.0>,poolboy_sup} had child riak_core_vnode_worker started with riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[211232658520482062396626323478525280184646500352,...]},...]) at undefined exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in context shutdown_error

2016-02-19 16:25:26.809 [error] <0.5451.0> gen_server <0.5451.0> terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}

2016-02-19 16:25:26.809 [error] <0.5451.0> CRASH REPORT Process <0.5451.0> with 0 neighbours exited with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_server:terminate/6 line 744
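
In case it helps anyone triaging a longer console log, here is a minimal Python sketch that tallies error lines by failure reason. It assumes every entry uses the same "date time [level] <pid> message" prefix as the lines above. The names classify and summarize and the labels they return are made up for illustration; they are not part of Riak or lager.

```python
import re
import sys
from collections import Counter

# Prefix seen in the excerpt above: "<date> <time> [level] <pid> message".
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<level>\w+)\] (?P<pid><[\d.]+>) (?P<msg>.*)$"
)

def classify(msg):
    """Return a short, illustrative label for the reason in one message."""
    if "no function clause matching" in msg:
        # Capture the module:function that had no matching clause.
        m = re.search(r"no function clause matching ([\w:]+)\(", msg)
        return "function_clause " + (m.group(1) if m else "?")
    if "{timeout," in msg:
        return "timeout"
    return "other"

def summarize(lines):
    """Count [error] entries per reason label; non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "error":
            counts[classify(m.group("msg"))] += 1
    return counts

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_log.py <path to a console log file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for reason, n in summarize(fh).most_common():
            print(n, reason)
```

Grouping this way separates the function-clause failures in riak_kv_vnode:handle_info from the gen_server call timeouts that follow them, which makes it easier to see which one appears first in a given incident.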

Our setup is as follows:
We have a Riak cluster with 10 nodes. Each node is configured as follows:
RAM: 48GB
Disk:
         80GB (/)
         504GB (separate riak partition)
Riak Version: 2.1.3-1 (2.1.3)
Data in Riak: after observing the crash, total data in the Riak partition was ~50GB

The Riak config is as follows:
riak.conf
[Attached with this email]

advanced.config:

[
 {riak_kv, [{add_paths, ["/usr/local/lib/scale_riak/ebin"]}]},
 {webmachine, [{backlog, 511}, {nodelay, true}]},
 {yokozuna, [{solr_request_timeout, 120000}]}
].

We have observed this a few times now. After each crash, latency increases and our application starts timing out.
We would really like to understand what might be causing this crash. If it is due to a missing setting in the config on our nodes, we would like to fix it.

Thanks for your help in advance :-)

Regards,
Raviraj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: riak.conf
Type: application/octet-stream
Size: 13914 bytes
Desc: riak.conf
URL: <http://lists.basho.com/pipermail/riak-users_lists.basho.com/attachments/20160222/4b1341f7/attachment.conf>

