Intermittent MapReduce crashes (reserve_vm)

Brian Conway bconway at rcesoftware.com
Sat Apr 28 01:38:17 EDT 2012


Any ideas? Using JS for MapReduce, I'm currently unable to do anything
except trivial tasks without these crashes, so I must be doing
something wrong. I'm fine with slow results (it is virtualized on
commodity hardware, after all), but intermittent VM errors don't seem
right. Thanks in advance.

Brian Conway

On Thu, Apr 26, 2012 at 12:23 AM, Brian Conway <bconway at rcesoftware.com> wrote:
> I have a test cluster of 3 nodes running locally (virtualized), with
> default configuration + eleveldb. The nodes have plenty of RAM and
> never hit swap. I've already bumped up the JS VM count (8 -> 24) after
> getting preflist_exhausted errors, and I now get the following
> intermittently when posting to /mapred:
>
> $ curl -X POST http://10.236.174.131:8098/mapred -H "Content-Type: application/json" -d @volume.js
> {"phase":3,"error":"{noproc,{gen_server,call,[riak_kv_js_map,{reserve_vm,<11534.1650.0>},infinity]}}","input":"{ok,{r_object,<<\"vol\">>,<<\"6724_2012-01-21_18\">>,[{r_content,{dict,4,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"content-type\">>,97,112,112,108,105,99,97,116,105,111,110,47,106,115,111,110],[<<\"X-Riak-VTag\">>,52,68,111,85,117,98,80,99,106,114,79,106,71,115,107,118,85,67,88,117,68,107]],[[<<\"index\">>]],[],[[<<\"X-Riak-Last-Modified\">>|{1335,385687,399828}]],[],[]}}},<<\"{\"dlid\":
> \"1\", \"rate\": \"0.08\", \"cnid\":
> \"...\">>}],...},...}","type":"exit","stack":"[{gen_server,call,3},{riak_kv_js_manager,blocking_dispatch,4},{riak_kv_mrc_map,map_js,3},{riak_kv_mrc_map,process,3},{riak_pipe_vnode_worker,process_input,3},{riak_pipe_vnode_worker,wait_for_input,2},{gen_fsm,handle_msg,7},{proc_lib,init_p_do_apply,3}]"}
>
> This only seems to happen every two or three attempts; the rest
> complete successfully. Doing the same with Python and protocol buffers
> also gives inconsistent results. Those attempts sometimes work and
> sometimes throw errors that are either the same as above or like
> these (may be unrelated):
>
> ...
>  File "/home/bconway/scratch/riakenv/lib/python2.6/site-packages/riak/transports/pbc.py", line 535, in recv_pkt
>    % len(nmsglen))
> riak.RiakError: 'Socket returned short packet length 3 - expected 4'
>
> ...
>  File "/home/bconway/scratch/riakenv/lib/python2.6/site-packages/riak/transports/pbc.py", line 535, in recv_pkt
>    % len(nmsglen))
> riak.RiakError: 'Socket returned short packet length 1 - expected 4'
>
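The "short packet length" messages above are consistent with a single recv() call returning fewer than the 4 bytes of the length prefix, which a stream socket is allowed to do. The following is a minimal sketch of a reader that loops until the requested bytes arrive. It is not the riak client's actual code: recv_exact and read_frame are hypothetical names, Python 3 is assumed, and the framing (a 4-byte big-endian length followed by that many bytes) is an assumption about the protocol buffers interface.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty read means the peer closed the connection.
            raise ConnectionError(
                "socket closed with %d bytes still expected" % remaining
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(sock):
    """Read one length-prefixed frame (assumed: 4-byte big-endian length, then body)."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

A reader written this way would not report a short length prefix just because the 4 bytes arrived in two pieces, although it would still fail if the server closed the connection mid-frame.
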
> The MapReduce itself is wide but fairly simple: 10 user bucket-key
> pairs, a few layers of links, and dump the final data:
>
> $ cat volume.js
> {"inputs":[["user","1672_2012-01"],["user","2672_2012-01"],["user","3672_2012-01"],["user","4672_2012-01"],["user","5672_2012-01"],["user","6672_2012-01"],["user","672_2012-01"],["user","6723_2012-01"],["user","6724_2012-01"],["user","6725_2012-01"]],
>  "query":[{"link":{"tag":"day"}},
>          {"link":{"tag":"usage"}},
>          {"link":{"tag":"contact"}},
>          {"map":{
>              "language":"javascript",
>              "name":"Riak.mapValuesJson"
>          }}
>         ]
> }
>
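To measure how often the job fails, the same request can be built and posted from a script. The sketch below assumes Python 3 and uses only the standard library; build_job, post_mapred and is_error_body are hypothetical helper names, and treating any JSON object with an "error" key as a failure is an assumption based on the error response shown earlier.

```python
import json
import urllib.error
import urllib.request


def build_job(inputs, link_tags):
    """Build a job shaped like volume.js: link phases, then Riak.mapValuesJson."""
    query = [{"link": {"tag": tag}} for tag in link_tags]
    query.append({"map": {"language": "javascript", "name": "Riak.mapValuesJson"}})
    return {"inputs": [list(pair) for pair in inputs], "query": query}


def post_mapred(url, job, timeout=60):
    """POST the job as JSON and return (status, body), including for HTTP errors."""
    req = urllib.request.Request(
        url,
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")


def is_error_body(body):
    """Assumed rule: unparseable bodies and objects with an "error" key are failures."""
    try:
        parsed = json.loads(body)
    except ValueError:
        return True
    return isinstance(parsed, dict) and "error" in parsed
```

Calling post_mapred("http://10.236.174.131:8098/mapred", job) in a loop and counting the bodies for which is_error_body returns True would give a rough failure rate to compare across JS VM count settings.
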
> The logs are fairly chatty; let me know what else I should add:
>
> ** Reason for termination ==
> ** {{{badmatch,[]},[{riak_kv_js_manager,needs_reload,2},{riak_kv_js_manager,handle_call,3},{gen_server,handle_msg,5},{proc_lib,init_p_do_apply,3}]},{gen_server,call,[riak_kv_js_map,{mark_idle,<0.1756.0>},infinity]}}
> 2012-04-26 00:18:18 =CRASH REPORT====
>  crasher:
>    initial call: riak_kv_js_vm:init/1
>    pid: <0.1756.0>
>    registered_name: []
>    exception exit:
> {{{badmatch,[]},[{riak_kv_js_manager,needs_reload,2},{riak_kv_js_manager,handle_call,3},{gen_server,handle_msg,5},{proc_lib,init_p_do_apply,3}]},{gen_server,call,[riak_kv_js_map,{mark_idle,<0.1756.0>},infinity]}}
>      in function  gen_server:terminate/6
>      in call from proc_lib:init_p_do_apply/3
>    ancestors: [riak_kv_js_sup,riak_kv_sup,<0.256.0>]
>    messages: [{'DOWN',#Ref<0.0.0.149247>,process,<0.1753.0>,{timeout,{gen_server,call,[<0.1764.0>,{checkout_to,<0.2736.0>},1000]}}}]
>    links: [<0.275.0>]
>    dictionary: []
>    trap_exit: false
>    status: running
>    heap_size: 1597
>    stack_size: 24
>    reductions: 627539
>  neighbours:
>
> Thanks for any help.
>
> Brian Conway



