ListKeys or MapReduce

OJ Reeves oj at buffered.io
Thu Feb 14 07:21:57 EST 2013


Chris,

I've never heard of do_prereduce before. What kind of effect does this
have? That is, if someone were to use it all the time, regardless of the
amount of data being returned, would this be a bad thing?
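
For reference, here is a minimal sketch (in Python, which is my own choice and
not something used in the thread) of the two request bodies quoted below. It
shows that the only difference between them is the "arg" entry on the reduce
phase. The helper name count_job is hypothetical; the field names are copied
from Christian's curl examples. It only builds the JSON and does not contact
Riak.

```python
import json


def count_job(bucket, prereduce=False):
    """Build the MapReduce job body used in the quoted curl examples.

    count_job is a hypothetical helper; it is not part of any Riak client.
    """
    reduce_phase = {
        "language": "erlang",
        "module": "riak_kv_mapreduce",
        "function": "reduce_count_inputs",
    }
    if prereduce:
        # The only addition in the second quoted example.
        reduce_phase["arg"] = {"do_prereduce": True}
    return {
        # The $bucket index is queried with the bucket name as the key.
        "inputs": {"bucket": bucket, "index": "$bucket", "key": bucket},
        "query": [{"reduce": reduce_phase}],
    }


if __name__ == "__main__":
    # Print the body that would be POSTed to /mapred with pre-reduce enabled.
    print(json.dumps(count_job("goog", prereduce=True), indent=2))
```

The printed JSON could be passed to curl with -d in place of the inline body.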

Thanks.
OJ

On Thu, Feb 14, 2013 at 6:19 PM, Christian Dahlqvist <christian at basho.com> wrote:

> Hi,
>
> For buckets with a significant number of records, it makes a lot of sense
> to run the example I provided with 'do_prereduce' enabled as it will result
> in considerably less data being sent between the nodes. This can be enabled
> as follows:
>
> curl -XPOST http://localhost:8098/mapred \
>   -H 'Content-Type: application/json' \
>   -d '{"inputs":{
>            "bucket":"goog",
>            "index":"$bucket",
>            "key":"goog"
>        },
>        "query":[{"reduce":{"language":"erlang",
>                            "module":"riak_kv_mapreduce",
>                            "function":"reduce_count_inputs",
>                            "arg":{"do_prereduce":true}}}]}'
>
> Best regards,
>
> Christian
>
>
> On 14 Feb 2013, at 08:01, Christian Dahlqvist <christian at basho.com> wrote:
>
> Hi Jeremiah,
>
> It does indeed not seem to be documented on the main docs site, and I will
> try to correct this. The only place I have found it described is on the
> wiki for the Ruby client (
> https://github.com/basho/riak-ruby-client/wiki/Secondary-Indexes).
>
> Below is also an example of a simple mapreduce job that shows how to count
> the number of records in the 'goog' bucket based on the $bucket secondary
> index:
>
> curl -XPOST http://localhost:8098/mapred \
>   -H 'Content-Type: application/json' \
>   -d '{"inputs":{
>            "bucket":"goog",
>            "index":"$bucket",
>            "key":"goog"
>        },
>        "query":[{"reduce":{"language":"erlang",
>                            "module":"riak_kv_mapreduce",
>                            "function":"reduce_count_inputs"}}]}'
>
> I hope this helps.
>
> Best regards,
>
> Christian
>
>
> On 13 Feb 2013, at 18:12, Jeremiah Peschka <jeremiah.peschka at gmail.com>
> wrote:
>
> Is this documented anywhere on the docs.basho.com site?
>
> Searching for $bucket produces search results just for "bucket" and
> Google says "No results found for *site:docs.basho.com $bucket*."
>
> ---
> Jeremiah Peschka - Founder, Brent Ozar Unlimited
> MCITP: SQL Server 2008, MVP
> Cloudera Certified Developer for Apache Hadoop
>
>
> On Wed, Feb 13, 2013 at 10:08 AM, Christian Dahlqvist <christian at basho.com> wrote:
>
>> Hi,
>>
>> In addition to the $key index, there is also a $bucket index available by
>> default. This contains the name of the bucket, and can be used to get all
>> keys in a specific bucket.
>>
>> Best regards,
>>
>> Christian
>>
>>
>
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>


-- 

OJ Reeves
+61 431 952 586
http://buffered.io/