bryan at basho.com
Thu Dec 1 07:38:58 EST 2011
On Wed, Nov 30, 2011 at 4:56 PM, Kresten Krab Thorup
<krab at trifork.com> wrote:
> Now, it seems that if the items passing between the map phase and
> reduce phase are many and small, then the system could benefit from a
> "chunking" fitting that collects items and sends them off in a list to
> the next fitting, after receiving X # of items, or after some timeout
> (say, Y ms); i.e. basically Nagle's algorithm. Otherwise, we may get
> burned pretty seriously by the "sync send" that happens between
> fittings !?

Indeed, a synchronous send for each object could be a real drag. A
chunking fitting may help, if it can alleviate any contention, though
you'll still see a sync send per object inbound to that fitting. The
worker behavior doesn't support timeout yet, so it will have to be
based on number of items received for now.
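
To make the count-based variant concrete, here is a minimal sketch in Python rather than Erlang. It is not the Riak Pipe worker API: the class name CountChunker, its process/flush methods, and the send callback are all hypothetical stand-ins. It buffers items and forwards them as one list once chunk_size items have arrived, with an explicit flush for the leftover items at end of input, since no timeout is available.

```python
from typing import Any, Callable, List


class CountChunker:
    """Hypothetical stand-in for a chunking fitting (not the Riak Pipe API)."""

    def __init__(self, chunk_size: int, send: Callable[[List[Any]], None]) -> None:
        if chunk_size < 1:
            raise ValueError("chunk_size must be at least 1")
        self.chunk_size = chunk_size
        self.send = send  # assumed downstream send: one call per chunk
        self.buffer: List[Any] = []

    def process(self, item: Any) -> None:
        """Buffer one inbound item; forward a full chunk when ready."""
        self.buffer.append(item)
        if len(self.buffer) >= self.chunk_size:
            self.flush()

    def flush(self) -> None:
        """Forward whatever is buffered (call this at end of input)."""
        if self.buffer:
            chunk, self.buffer = self.buffer, []
            self.send(chunk)
```

With chunk_size set to 3 and seven inputs, the downstream side sees three sends instead of seven; the inbound side still pays one call per item, as noted above.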

I have two outstanding todos that may also help. The first is allowing
a worker to request all inputs in its queue, instead of just the next
one, from its vnode. The second is a bulk-sending request, where a
fitting could say "send these 10 outputs" but a vnode might reply back
"I only queued the first 5" (similar to the common io pattern "write N
bytes" => "N-M bytes written"). They each have a couple of tricky
corner cases, and need benchmarking anyway, so I'm not sure when
they'll make mainline yet.
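
A rough sketch of how those two ideas could fit together, again in Python and with invented names throughout (BoundedQueue, offer_many, take_all, and send_all are assumptions for illustration, not existing Riak Pipe functions). The queue accepts as many outputs as it has room for and reports the count, and the sender retries the unsent tail, much like a short-write loop. The worker side takes everything queued in one request.

```python
from collections import deque
from typing import Any, Callable, List


class BoundedQueue:
    """Hypothetical model of a vnode's input queue for one worker."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.items: deque = deque()

    def offer_many(self, outputs: List[Any]) -> int:
        """Queue as many outputs as fit; return how many were accepted."""
        room = self.capacity - len(self.items)
        accepted = outputs[:room]
        self.items.extend(accepted)
        return len(accepted)

    def take_all(self) -> List[Any]:
        """Hand the worker every queued input at once, not just the next."""
        drained = list(self.items)
        self.items.clear()
        return drained


def send_all(queue: BoundedQueue, outputs: List[Any],
             consume: Callable[[List[Any]], None]) -> None:
    """Bulk-send outputs, retrying the tail the queue did not accept."""
    sent = 0
    while sent < len(outputs):
        sent += queue.offer_many(outputs[sent:])
        if sent < len(outputs):
            # Queue is full: let the worker drain it before retrying.
            consume(queue.take_all())
    consume(queue.take_all())
```

In this toy model the sender drives the worker directly; in a real system the sender would instead wait or be told when room is available, which is where the tricky corner cases are likely to be.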