ListKeys or MapReduce

Christian Dahlqvist christian at basho.com
Thu Feb 14 03:19:40 EST 2013


Hi,

For buckets with a significant number of records, it makes sense to run the example I provided with 'do_prereduce' enabled, as this results in considerably less data being sent between the nodes. It can be enabled as follows:

curl -XPOST http://localhost:8098/mapred \
  -H 'Content-Type: application/json' \
  -d '{"inputs":{
           "bucket":"goog",
           "index":"$bucket",
           "key":"goog"
       },
       "query":[{"reduce":{"language":"erlang",
                           "module":"riak_kv_mapreduce",
                           "function":"reduce_count_inputs",
                           "arg":{"do_prereduce":true}}}]}'
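
If you would rather build the job from code than by hand in curl, here is a minimal Python sketch of the same request. The function names build_count_job and submit_job are my own, purely for illustration, and the URL assumes a local node on the default HTTP port as in the curl command above. Only the standard library is used, and I have not tested this against every Riak version, so treat it as a starting point.

```python
import json
import urllib.request


def build_count_job(bucket, prereduce=False):
    """Build a MapReduce job that counts the keys in `bucket`.

    Keys are fed in through the built-in $bucket secondary index,
    mirroring the JSON body of the curl example.
    """
    reduce_phase = {
        "language": "erlang",
        "module": "riak_kv_mapreduce",
        "function": "reduce_count_inputs",
    }
    if prereduce:
        # Ask for the reduce function to be applied on each node
        # before results are sent on to the coordinating node.
        reduce_phase["arg"] = {"do_prereduce": True}
    return {
        "inputs": {"bucket": bucket, "index": "$bucket", "key": bucket},
        "query": [{"reduce": reduce_phase}],
    }


def submit_job(job, url="http://localhost:8098/mapred"):
    """POST the job to a Riak node (hypothetical helper; URL is an assumption)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(job).encode("utf-8"),  # a request with data is sent as POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the request body only; call submit_job(...) against a live node.
    print(json.dumps(build_count_job("goog", prereduce=True), indent=2))
```

Calling submit_job(build_count_job("goog", prereduce=True)) should send the same body as the curl command; leaving prereduce at its default omits the "arg" entry and matches my earlier example.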

Best regards,

Christian


On 14 Feb 2013, at 08:01, Christian Dahlqvist <christian at basho.com> wrote:

> Hi Jeremiah,
> 
> It does not seem to be documented on the main docs site, and I will try to correct this. The only place I have found it described is the wiki for the Ruby client (https://github.com/basho/riak-ruby-client/wiki/Secondary-Indexes).
>  
> Below is also an example of a simple mapreduce job that shows how to count the number of records in the 'goog' bucket based on the $bucket secondary index:
> 
> curl -XPOST http://localhost:8098/mapred \
>   -H 'Content-Type: application/json' \
>   -d '{"inputs":{
>            "bucket":"goog",
>            "index":"$bucket",
>            "key":"goog"
>        },
>        "query":[{"reduce":{"language":"erlang",
>                            "module":"riak_kv_mapreduce",
>                            "function":"reduce_count_inputs"}}]}'
> 
> I hope this helps.
> 
> Best regards,
> 
> Christian
> 
> 
> On 13 Feb 2013, at 18:12, Jeremiah Peschka <jeremiah.peschka at gmail.com> wrote:
> 
>> Is this documented anywhere on the docs.basho.com site? 
>> 
>> Searching for $bucket produces search results just for "bucket" and Google says "No results found for site:docs.basho.com $bucket."
>> 
>> ---
>> Jeremiah Peschka - Founder, Brent Ozar Unlimited
>> MCITP: SQL Server 2008, MVP
>> Cloudera Certified Developer for Apache Hadoop
>> 
>> 
>> On Wed, Feb 13, 2013 at 10:08 AM, Christian Dahlqvist <christian at basho.com> wrote:
>> Hi,
>> 
>> In addition to the $key index, there is also a $bucket index available by default. This contains the name of the bucket, and can be used to get all keys in a specific bucket.
>> 
>> Best regards,
>> 
>> Christian
>> 
> 
