Riak node crashes on startup

shaharke shahar at bigpandamedia.com
Sun Oct 7 04:29:08 EDT 2012


Hi,

Our production environment consists of a 4-node cluster running on 4 separate
physical machines. The cluster had been healthy for over 3 weeks. The problem
started when a simple M/R job, which had worked perfectly until then, began
timing out on every run. Unable to find the cause of the timeout, I decided
to restart all cluster nodes (I know, a naive solution, but I couldn't think
of anything better). 3 of the 4 nodes started successfully. However, one node
crashes on every startup with the following error (from crash.log):

2012-10-07 02:07:19 =ERROR REPORT====
** State machine <0.18176.885> terminating 
** Last message in was
{'$gen_sync_all_state_event',{<0.18174.885>,#Ref<0.0.3354.19148>},{shutdown,60000}}
** When State == ready
**      Data  == {state,{[],[]},<0.18177.885>,[],undefined}
** Reason for termination = 
** {timeout,{gen_fsm,sync_send_all_state_event,[<0.18177.885>,stop]}}
2012-10-07 02:07:55 =CRASH REPORT====
  crasher:
    initial call: riak_core_vnode:init/1
    pid: <0.28204.1745>
    registered_name: []
    exception exit:
{{{badmatch,{error,{badarg,[{erlang,binary_to_term,[<<131,108,0,0,0,12,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,19,64,73,114,105,115,104,67,101,108,116,105,99,65,112,112,97,114,101,108,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,6,64,97,114,98,121,115,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,5,64,78,67,73,83,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,7,64,77,97,114,118,101,108,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,16,35,49,49,51,48,54,51,49,54,53,51,55,52,50,57,57,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,11,64,98,116,116,102,115,101,114,105,101,115,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,5,64,83,111,110,121,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0
,0,0,14,64,77,111,110,115,116,101,114,69,110,101,114,103,121,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,17,64,87,111,114,100,115,87,105,116,104,70,114,105,101,110,100,115,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,13,64,116,104,101,105,110,113,117,105,115,105,116,114,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,10,64,77,101,116,97,108,108,105,99,97,109,0,0,0,9,56,48,54,52,54,52,50,55,50,110,7,1,49,91,194,174,62,203,4,100,0,9,117,110,100,101,102,105,110,101,100,104,4,104,3,109,0,0,0,11,117,115,101,114,115,0,0,0,90,131,108,0,0,0,1,104,4,104,3,109,0,0,0,11,117,115,101,114,115,45,108,105,107,101,115,109,0,0,0,5,108,105,107,101,115,109,0,0,0,12,64,97,109,99,116,104,101,97,116,114,101,115,109,0,0,0,9,56,49,52,55,49>>],[]},{mi_buffer,read_value,2,[{file,"src/mi_buffer.erl"},{line,162}]},{mi_buffer,open_inner,3,[{file,"src/mi_buffer.erl"},{line,70}]},{mi_buffer,new,1,[{file,"src/mi_buffer.erl"},{line,62}]},{mi_server,read_buffers,4,[{file,"src/mi_server.erl"},{line,605}]},{mi_server,read_buf_and_seg,1,[{file,"src/mi_server.erl"},{line,585}]},{mi_server,init,1,[{file,"src/mi_server.erl"},{line,143}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]}]}}},[{merge_index_backend,start,2,[{file,"src/merge_index_backend.erl"},{line,47}]},{riak_search_vnode,init,1,[{file,"src/riak_search_vnode.erl"},{line,135}]},{riak_core_vnode,init,1,[{file,"src/riak_core_vnode.erl"},{line,123}]},{gen_fsm,init_it,6,[{file,"gen_fsm.erl"},{line,361}]},{proc_
lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]},[{gen_fsm,init_it,6,[{file,"gen_fsm.erl"},{line,379}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
    ancestors: [riak_core_vnode_sup,riak_core_sup,<0.27100.1745>]
    messages: []
    links: [<0.27106.1745>,<0.28205.1745>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 987
    stack_size: 24
    reductions: 736
  neighbours:
    neighbour:
[{pid,<0.28205.1745>},{registered_name,[]},{initial_call,{mi_server,init,['Argument__1']}},{current_function,{erlang,process_info,2}},{ancestors,[<0.28204.1745>,riak_core_vnode_sup,riak_core_sup,<0.27100.1745>]},{messages,[]},{links,[<0.28204.1745>,#Port<0.70075163>]},{dictionary,[{random_seed,{4574,27586,21558}}]},{trap_exit,false},{status,suspended},{heap_size,6765},{stack_size,13},{reductions,521779}]
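
For anyone reading the byte dump above: the argument to binary_to_term appears to be data in the Erlang external term format, where 131 is the version byte, 108 a list header, 104 a small tuple and 109 a binary. The sketch below is only an illustration in Python, not part of Riak or merge_index; the names HEAD, TAIL, read_binary and describe_head are made up for this example. It decodes the first few bytes of the dump and then the bytes just before the second 131 marker, where a binary declares 11 bytes but only "users" is present before what looks like the start of another record. That would be consistent with a partially written buffer entry, though this is an assumption on my part and not a confirmed diagnosis.

```python
import struct

# Tags from the Erlang external term format.
VERSION_MAGIC, LIST_EXT, SMALL_TUPLE_EXT, BINARY_EXT = 131, 108, 104, 109

# First bytes of the binary shown in the crash report.
HEAD = bytes([131, 108, 0, 0, 0, 12, 104, 4, 104, 3,
              109, 0, 0, 0, 11,
              117, 115, 101, 114, 115, 45, 108, 105, 107, 101, 115])

# Bytes copied from the dump just before and after the second 131 marker.
TAIL = bytes([109, 0, 0, 0, 11, 117, 115, 101, 114, 115,
              0, 0, 0, 90, 131, 108, 0, 0, 0, 1])


def read_binary(data, pos):
    """Read a BINARY_EXT at pos and return (payload, next_pos)."""
    if data[pos] != BINARY_EXT:
        raise ValueError("expected BINARY_EXT tag")
    (length,) = struct.unpack(">I", data[pos + 1:pos + 5])
    start = pos + 5
    return data[start:start + length], start + length


def describe_head(data):
    """Return (list_length, outer_arity, inner_arity, first_binary)."""
    if data[0] != VERSION_MAGIC or data[1] != LIST_EXT:
        raise ValueError("expected version byte followed by LIST_EXT")
    (list_len,) = struct.unpack(">I", data[2:6])
    if data[6] != SMALL_TUPLE_EXT or data[8] != SMALL_TUPLE_EXT:
        raise ValueError("expected two nested small tuples")
    first, _ = read_binary(data, 10)
    return list_len, data[7], data[9], first


if __name__ == "__main__":
    # A list of 12 entries, each a 4-tuple whose first element is a 3-tuple
    # starting with the binary <<"users-likes">>.
    print(describe_head(HEAD))
    # The declared 11-byte binary here runs into what looks like a new record.
    payload, _ = read_binary(TAIL, 0)
    print(payload)
```

Running it prints the decoded header fields and the 11 bytes that the last binary claims to contain, which start with "users" and then continue with non-text bytes instead of "-likes".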

If required, I can attach the full log files from the crash event.

System environment:
Riak version: 1.2.0-1
Storage type: LevelDB
OS: Ubuntu 10.11
RAM: 8GB
CPU: Intel Xeon-SandyBridge E3-1270-Quadcore [3.4GHz]

Thanks,
Shahar






