"consistent" map/reduce

Kresten Krab Thorup krab at trifork.com
Wed Nov 30 08:36:52 EST 2011


Hi Bryan, 

I've now implemented the bulk of this, and I have some follow-up questions:

1. How do I create the initial inputs, i.e. the list of all {Index, Node} pairs that go into the riak_kv_pipe_listkeys fitting? Does this fitting need a special chashfun to send each input to the right vnode?

2. Given such a pipeline-based thingie installed as .beam files with Riak, is there a way to "invoke" it via the HTTP M/R API? It would be great if I didn't have to poke a new hole through TCP/IP to exploit pipe.

3. Does riak_pipe run a fitting instance per node, or per vnode?

FYI, I'm considering doing a disk-based version of riak_pipe_w_reduce that keeps the intermediate results in a local K/V store, in order to support large keysets. This could just be a bitcask with merge disabled. Re-implementing the reducer would also allow us to evaluate the N>R condition in the reducer and emit results as early as possible.
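
To make the disk-based reducer idea concrete, here is a rough sketch. It is written in Python rather than Erlang, it uses SQLite as a stand-in for bitcask, and it is not based on the actual riak_pipe_w_reduce code. It also assumes one reading of the early-emit condition: each input is a key reported by one replica, and a key is emitted as soon as R reports for it have arrived. The names DiskBackedReducer and reduce_stream are hypothetical.

```python
import os
import sqlite3
import tempfile


class DiskBackedReducer:
    """Keeps per-key counts in an on-disk table instead of in memory.

    Assumption: SQLite stands in for a local K/V store such as bitcask.
    """

    def __init__(self, path, r):
        self.r = r  # number of reports required before a key is emitted
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS seen "
            "(key TEXT PRIMARY KEY, n INTEGER NOT NULL)"
        )

    def add(self, key):
        """Record one report of key; return True exactly when the count reaches R."""
        self.db.execute(
            "INSERT OR IGNORE INTO seen (key, n) VALUES (?, 0)", (key,)
        )
        self.db.execute("UPDATE seen SET n = n + 1 WHERE key = ?", (key,))
        (n,) = self.db.execute(
            "SELECT n FROM seen WHERE key = ?", (key,)
        ).fetchone()
        return n == self.r

    def close(self):
        self.db.commit()
        self.db.close()


def reduce_stream(inputs, r, path):
    """Yield each key as soon as R reports for it have been seen."""
    reducer = DiskBackedReducer(path, r)
    try:
        for key in inputs:
            if reducer.add(key):
                yield key  # emitted early, without waiting for end of input
    finally:
        reducer.close()


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "reduce.db")
    for k in reduce_stream(["a", "b", "a", "c", "b", "a"], 2, db_path):
        print(k)
```

Because the counts live on disk, memory use stays roughly constant as the keyset grows, and a key seen fewer than R times is simply never emitted.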


Kresten

Mobile: + 45 2343 4626 | Skype: krestenkrabthorup | Twitter: @drkrab
Trifork A/S  |  Margrethepladsen 4  | DK- 8000 Aarhus C |  Phone : +45 8732 8787  |  www.trifork.com
 




