Map phase timeout

Christian Dahlqvist christian at basho.com
Mon Apr 8 04:27:05 EDT 2013


Hi Matt,

If you have a complicated mapreduce job containing multiple phases implemented in JavaScript, you will most likely see heavy contention for the JavaScript VMs, which will cause problems. While you can tune the JavaScript VM pool configuration [1], you may find that you need a very large pool size to properly support your job, especially for map phases, as these run in parallel.

The best way to speed up the mapreduce job and get around the VM pool contention is to implement the mapreduce functions in Erlang.
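
As a rough illustration only, here is how a job with an Erlang phase might be described for Riak's HTTP /mapred interface, built from Python. The helper names erlang_phase and build_job are made up for this sketch, and the timeout value is arbitrary. riak_kv_mapreduce:map_object_value is one of the functions that ship with Riak; any function of your own would have to be compiled and available on every node. Note that, as far as I know, the "timeout" field (in milliseconds) applies to the job as a whole, not to an individual phase.

```python
import json


def erlang_phase(kind, module, function, keep=False):
    """Describe one map or reduce phase backed by an Erlang function."""
    if kind not in ("map", "reduce"):
        raise ValueError("kind must be 'map' or 'reduce'")
    return {
        kind: {
            "language": "erlang",
            "module": module,
            "function": function,
            "keep": keep,
        }
    }


def build_job(inputs, phases, timeout_ms=None):
    """Assemble the JSON-serialisable job description (hypothetical helper)."""
    job = {"inputs": inputs, "query": list(phases)}
    if timeout_ms is not None:
        # Job-level timeout in milliseconds; it covers the whole query.
        job["timeout"] = timeout_ms
    return job


job = build_job(
    inputs="cart-products",  # bucket name taken from the error output
    phases=[
        erlang_phase("map", "riak_kv_mapreduce", "map_object_value", keep=True),
    ],
    timeout_ms=300000,  # arbitrary example value
)
payload = json.dumps(job)  # body to POST to the /mapred endpoint
```

A multi-phase query would simply pass several such phase descriptions in order.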

Best regards,

Christian 

[1] http://docs.basho.com/riak/1.2.0/references/appendices/MapReduce-Implementation/#Configuration-Tuning-for-Javascript



--------------------
Christian Dahlqvist
Client Services Engineer
Basho Technologies
EMEA Office
E-mail: christian at basho.com
Skype: c.dahlqvist
Mobile: +44 7890 590 910

On 8 Apr 2013, at 08:20, Matt Black <matt.black at jbadigital.com> wrote:

> Thanks for the reply, Christian.
> 
> I didn't explain well enough in my first post: the map reduce operation merely loads a bunch of objects, and the Python script that connects to Riak then writes these objects to disk. (It's probably obvious, but I'm using JavaScript and the Riak Python client.)
> 
> The query itself has many map phases where a composite object is built up from related objects spread across many buckets.
> 
> I was hoping there may be some kind of timeout I could adjust on a per-map phase basis - clutching at straws really.
> 
> Cheers
> Matt
> 
> 
> On 8 April 2013 17:14, Christian Dahlqvist <christian at basho.com> wrote:
> Hi,
> 
> Without having access to the mapreduce functions you are running, I would assume that a mapreduce job that both writes data to disk and deletes the written record from Riak might be quite slow. This is not really a use case mapreduce was designed for, and when a mapreduce job crashes or times out it is difficult to know how far it got in processing the different records.
> 
> I would therefore recommend considering running this type of archiving and delete job as an external batch process instead as it will give you better control over the execution and avoid timeout problems.
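
A minimal sketch of what such an external batch process could look like, assuming the records can be fetched and deleted one key at a time. The store argument is a hypothetical dict-like stand-in for whatever client calls are actually used, and the function name and file layout (one JSON object per line) are illustrative choices, not anything from the original scripts.

```python
import json
import os


def archive_and_delete(store, keys, out_path):
    """Append each record to out_path as a JSON line, then delete it.

    `store` is a hypothetical dict-like stand-in for the real client.
    A record is deleted only after it has been flushed to disk, so a
    crash part-way through leaves every unarchived record in place.
    """
    archived = []
    with open(out_path, "a", encoding="utf-8") as out:
        for key in keys:
            value = store.get(key)
            if value is None:
                continue  # missing, e.g. already removed by an earlier run
            out.write(json.dumps({"key": key, "value": value}) + "\n")
            out.flush()
            os.fsync(out.fileno())  # make sure the record is on disk first
            del store[key]
            archived.append(key)
    return archived
```

Because each record is handled independently, a failed run can simply be restarted with the same key list: keys that were already archived and deleted are skipped.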
> 
> Best regards,
> 
> Christian
> 
> 
> 
> On 8 Apr 2013, at 00:49, Matt Black <matt.black at jbadigital.com> wrote:
> 
> > Dear list,
> >
> > I'm currently getting a timeout during a single phase of a multi-phase map reduce query. Is there anything I can do to help it complete?
> >
> > Its purpose is to back up and remove objects from Riak, so it will run periodically during quiet times, moving old data out of Riak into file storage.
> >
> > Traceback (most recent call last):
> >   File "./tools/rolling_backup.py", line 185, in <module>
> >     main()
> >   File "./tools/rolling_backup.py", line 181, in main
> >     args.func(**kwargs)
> >   File "/srv/backup/tools/mapreduce.py", line 295, in do_map_reduce
> >     raise e
> > Exception: {"phase":2,"error":"timeout","input":"[<<\"cart-products\">>,<<\"cd67d7f6e2688bc2089e6fa79506ac05-2\">>,{struct,[{<<\"uid\">>,<<\"cd67d7f6e2688bc2089e6fa79506ac05\">>},{<<\"cart\">>,{struct,[{<<\"expired_ts\">>,<<\"2013-03-05T19:12:23.906228\">>},{<<\"last_updated\">>,<<\"2013-03-05T19:12:23.906242\">>},{<<\"tags\">>,{struct,[{<<\"type\">>,<<\"AB\">>}]}},{<<\"completed\">>,false},{<<\"created\">>,<<\"2013-03-04T02:10:18.638413\">>},{<<\"products\">>,[{struct,[{<<\"cost\">>,0},{<<\"bundleName\">>,<<\"Product\">>},...]},...]},...]}},...]}]","type":"exit","stack":"[{riak_kv_w_reduce,'-js_runner/1-fun-0-',3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,283}]},{riak_kv_w_reduce,reduce,3,[{file,\"src/riak_kv_w_reduce.erl\"},{line,206}]},{riak_kv_w_reduce,maybe_reduce,2,[{file,\"src/riak_kv_w_reduce.erl\"},{line,157}]},{riak_pipe_vnode_worker,process_input,3,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,444}]},{riak_pipe_vnode_worker,wait_for_input,2,[{file,\"src/riak_pipe_vnode_worker.erl\"},{line,376}]},{gen_fsm,handle_msg,7,[{file,\"gen_fsm.erl\"},{line,494}]},{proc_lib,...}]"}
> >
> >
> 
> 
