Erlang MR bulk fetch
dzagidulin at basho.com
Wed Feb 6 15:16:29 EST 2013
How large are the objects that you're working with?
As part of a previous project, we ran some benchmarks on bulk fetches --
A/B testing two different options. Option one was fetching the objects via
MapReduce (as you are trying to do), and option two was issuing a bunch of
Counter-intuitively, it turned out that for any but the smallest object
sizes, it was faster to bulk retrieve objects using GETs instead of the M/R.
So it may be worth reconsidering that route in your case.
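To make the GET-based option concrete, here is a minimal Ruby sketch of a bulk fetch that issues individual GETs from a small, fixed pool of worker threads and returns a key => value hash, so each value stays associated with its key. The names `bulk_get`, `workers`, and `fetch_one` are made up for illustration and are not part of the Riak Ruby client; `fetch_one` stands in for whatever single-key GET you already use. This is not the code we benchmarked, just one way the idea could look.

```ruby
require "thread"

# Fetch many keys concurrently with a small, bounded number of threads.
# `fetch_one` is a hypothetical block (key -> value) standing in for a
# single-object GET; it is an assumption, not a Riak client API.
def bulk_get(keys, workers: 4, &fetch_one)
  queue = Queue.new
  keys.each { |k| queue << k }

  results = {}
  lock = Mutex.new

  # Never start more threads than there are keys to fetch.
  threads = Array.new([workers, keys.size].min) do
    Thread.new do
      loop do
        key = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break           # queue drained, this worker is done
        end
        value = fetch_one.call(key)
        lock.synchronize { results[key] = value }
      end
    end
  end

  threads.each(&:join)
  results
end
```

The block passed to `bulk_get` would wrap the client's ordinary GET (with R=1 if that is acceptable). The thread count stays fixed no matter how many keys are requested, which may address the concern about spawning many threads; whether a single client instance can safely be shared across threads is something to verify for the client version in use.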
(We also ended up having to write a custom Erlang M/R function to
base64-encode binary object values, since they crash the JSON results, as
On Sun, Dec 2, 2012 at 8:26 PM, Elias Levy <fearsome.lucidity at gmail.com> wrote:
> While processing a request I am finding that I am spending most of my time
> fetching data from Riak. As I am using the Ruby client I can't
> parallelise them, as it does not support non-blocking IO and I don't want
> to spawn many threads.
> For these requests I am willing to accept an R of 1, so I can use a MR job
> as a bulk fetch, but I would prefer to avoid the JS VM, as it's rather
> slow and I don't want to worry about running out of VMs at the wrong time.
> I can easily fetch the values using only Erlang MR by adding the keys to
> my MR object and doing
> map(["riak_kv_mapreduce", "map_object_value"], :arg => [ true ],
> :keep => true).
> but this gives me an array of values. I need to know which values
> are associated with which key, and my values do not include the key or
> otherwise allow me to map them.
> From looking at the Riak code there does not seem to exist an exported
> Erlang map function that will return a value with its key.
> Did I miss one? Is such a function somewhere in the source? If not,
> anyone have one handy they'd like to share?