Planning JS VM capacity, and how the JS VM count is intended to work (preflist_exhausted, etc)

Brian Conway bconway at rcesoftware.com
Mon Apr 30 17:25:30 EDT 2012


There is no shortage of references out there on avoiding a
preflist_exhausted error caused by having too few JS VMs when
running MapReduce with JavaScript:

http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-March/thread.html#7853
https://github.com/basho/riak_kv/issues/284
http://wiki.basho.com/Configuration-Files.html#app.config

My question is: are there any good guidelines for planning JS VM
capacity? For example, if you're running a particularly wide MapReduce
with JS and hit the "All VMs are busy" message in your logs, the
usual advice is to bump up the JS VM counts (map_js_vm_count and
reduce_js_vm_count in app.config) until you no longer hit it.

But isn't that just going to fail again as soon as you have more than
one M/R running at the same time?

Is it better to blindly increase the count into the thousands and hope
you cover all your bases, or set up a queuing mechanism outside of
Riak to fire off the MapReduce jobs at a rate that makes optimal use
of your hardware and JS VM count?
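
To make the second option concrete, here is a minimal sketch of what I
mean by a queuing mechanism outside of Riak. It is an illustration only:
submit_job is a hypothetical callable that sends one MapReduce request
and returns its result, and MAX_CONCURRENT_JOBS is a number I made up
that would have to be tuned against the configured JS VM counts.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: how many MapReduce jobs may be in flight at once.
# Not a Riak setting; it would be tuned against the JS VM counts.
MAX_CONCURRENT_JOBS = 2


def run_all(jobs, submit_job, max_concurrent=MAX_CONCURRENT_JOBS):
    """Run every job with at most max_concurrent in flight.

    submit_job is a hypothetical function that sends one MapReduce
    request to Riak and returns its result. Jobs beyond the limit wait
    in the executor's internal queue instead of hitting the cluster.
    Results come back in the same order as the input jobs.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(submit_job, jobs))
```

With something like this, a burst of requests would be held on the
client side rather than competing for VMs all at once.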

The MapReduce doc[1] touches briefly on configuration tuning for
JavaScript, but is there an option to have a pool configuration, where
MapReduce phases wait for a free JS VM when all are in use, rather
than erroring out?
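
By "wait for a free JS VM" I mean blocking checkout semantics, roughly
like the sketch below. This is not Riak code and not an existing
option as far as I know; BlockingPool and its methods are names I
invented purely to show the behavior I am asking about.

```python
import queue


class BlockingPool:
    """Hypothetical pool whose callers wait for a free resource."""

    def __init__(self, resources):
        self._free = queue.Queue()
        for resource in resources:
            self._free.put(resource)

    def checkout(self, timeout=None):
        # Blocks until a resource is free. With a timeout, raises
        # queue.Empty once the timeout elapses with nothing available.
        return self._free.get(timeout=timeout)

    def checkin(self, resource):
        # Returns a resource to the pool, waking one waiting caller.
        self._free.put(resource)
```

A caller would check out a VM, run its phase, and check the VM back
in; a second caller arriving while all VMs are busy would simply wait
instead of failing.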

I may have a fundamental misunderstanding of how this is supposed to
work; my apologies if so. Thanks in advance.

Brian Conway

[1] http://wiki.basho.com/MapReduce.html#Configuration-Tuning-for-Javascript



