ListKeys or MapReduce

Jeremiah Peschka jeremiah.peschka at gmail.com
Tue Feb 12 14:44:09 EST 2013


...and fixed!

You can get this right now if you're adventurous and want to build
CorrugatedIron from source by grabbing the develop branch [1]. We have
several other issues to clean up and verify before we release CI 1.1.1 in
the next day or so. Or you can download it from [2] if you don't want to
build it yourself and don't want to wait for NuGet. Once we push 1.1.1 to
NuGet, we'll respond to this thread or email you directly.

I make no guarantees that the new DLL won't eat your hard drive or turn
your computer into a killer robot.
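
For anyone curious about the bug described in the quoted message below: index
names get a _bin or _int suffix based on the index type, and the special $key
index has to be left alone. Here is a minimal sketch of that rule. It is not
the actual CorrugatedIron source. The IndexNames class, the IndexType enum, and
the ToRiakIndexName method are names made up for illustration, and treating
$bucket as special too is an assumption based on Riak's other built-in index.

```csharp
using System;

// Hypothetical names throughout; this is not CorrugatedIron's actual code.
public enum IndexType { Binary, Integer }

public static class IndexNames
{
    // Built-in Riak indexes that must be sent without a type suffix.
    // "$key" comes from the thread; "$bucket" is an assumption.
    private static readonly string[] SpecialIndexes = { "$key", "$bucket" };

    public static string ToRiakIndexName(string name, IndexType type)
    {
        // Special indexes pass through untouched. Per the thread, $key was
        // wrongly getting a suffix.
        if (Array.IndexOf(SpecialIndexes, name) >= 0)
        {
            return name;
        }

        string suffix = type == IndexType.Binary ? "_bin" : "_int";

        // Avoid double-suffixing a name that already carries the right suffix.
        return name.EndsWith(suffix, StringComparison.Ordinal) ? name : name + suffix;
    }
}
```

With this, a user-defined index such as "email" becomes "email_bin", while
"$key" is passed to Riak as-is.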

[1]: https://github.com/DistributedNonsense/CorrugatedIron/tree/develop
[2]: http://clientresources.brentozar.com.s3.amazonaws.com/CorrugatedIron-111-alpha.zip

---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop


On Tue, Feb 12, 2013 at 11:13 AM, Jeremiah Peschka <
jeremiah.peschka at gmail.com> wrote:

> Good news! You've found a bug in CorrugatedIron. Because of index naming,
> we muck with index names to give them a suffix of _bin or _int, depending
> on the index type. This shouldn't be happening on $key, but it is. I'll
> create a GitHub issue and get that taken care of.
>
>
> On Tue, Feb 12, 2013 at 7:56 AM, Kevin Burton <rkevinburton at charter.net> wrote:
>
>> I forgot to mention that when I execute this code I get the error:
>>
>> {not_found,
>>  {<<"products">>,
>>   <<"$keys">>},
>>  undefined}}}:[{mochijson2,json_encode,2,
>>                [{file,"src/mochijson2.erl"},{line,149}]},
>>               {mochijson2,'-json_encode_array/2-fun-0-',3,
>>                [{file,"src/mochijson2.erl"},{line,157}]},
>>               {lists,foldl,3,
>>                [{file,"lists.erl"},{line,1197}]},
>>               {mochijson2,json_encode_array,2,
>>                [{file,"src/mochijson2.erl"},{line,159}]},
>>               {riak_kv_pb_mapred,process_stream,3,
>>                [{file,"src/riak_kv_pb_mapred.erl"},{line,97}]},
>>               {riak_api_pb_server,process_stream,5,
>>                [{file,"src/riak_api_pb_server.erl"},{line,227}]},
>>               {riak_api_pb_server,handle_info,2,
>>                [{file,"src/riak_api_pb_server.erl"},{line,158}]},
>>               {gen_server,handle_msg,5,
>>                [{file,"gen_server.erl"},{line,607}]}] - CommunicationError
>>
>> From: riak-users [mailto:riak-users-bounces at lists.basho.com] On Behalf Of Kevin Burton
>> Sent: Tuesday, February 12, 2013 9:48 AM
>> To: 'Jeremiah Peschka'
>> Cc: 'riak-users'
>> Subject: RE: ListKeys or MapReduce
>>
>> The name is “$keys”? Something like:
>>
>>             using (IRiakEndPoint cluster = RiakCluster.FromConfig("riakConfig"))
>>             {
>>                 IRiakClient riakClient = cluster.CreateClient();
>>                 RiakBucketKeyInput bucketKeyInput = new RiakBucketKeyInput();
>>                 bucketKeyInput.AddBucketKey(productBucketName, "$keys");
>>                 RiakMapReduceQuery query = new RiakMapReduceQuery()
>>                    .Inputs(bucketKeyInput)
>>                    .MapJs(m => m.Name("Riak.mapValuesJson").Keep(true));
>>                 RiakResult<RiakMapReduceResult> result = riakClient.MapReduce(query);
>>                 if (result.IsSuccess)
>>                 {
>>
>> From: Jeremiah Peschka [mailto:jeremiah.peschka at gmail.com]
>> Sent: Tuesday, February 12, 2013 9:18 AM
>> To: Kevin Burton
>> Cc: riak-users
>> Subject: Re: ListKeys or MapReduce
>>
>> It would be queried like any other index as an MR input. I'll create an
>> issue and will try to get this in sometime in the next few days - no
>> promises, though.
>>
>>
>> On Tue, Feb 12, 2013 at 7:09 AM, Kevin Burton <rkevinburton at charter.net> wrote:
>>
>> I will read the other URLs that you mentioned. Thank you.
>>
>> Would you mind giving a short example (preferably using CI) of the $keys
>> index?
>>
>> From: Jeremiah Peschka [mailto:jeremiah.peschka at gmail.com]
>> Sent: Tuesday, February 12, 2013 8:52 AM
>> To: Kevin Burton
>> Cc: riak-users
>> Subject: Re: ListKeys or MapReduce
>>
>> They're both pretty crappy in terms of performance - they read all data
>> off of disk. If you're using LevelDB you can use the $keys index to pull
>> back just the keys that are in a single bucket.
>>
>> A better approach is to maintain a separate bucket - e.g. DocumentCount -
>> that is used for counting documents. Unfortunately, you can't guarantee
>> transactional consistency around counts in Riak today, so you'll want to
>> move maintaining the counts out of Riak and into something else. If you
>> search the list archives [1], you'll find that Redis has been mentioned as
>> a good way to solve this problem - counters are stored in Redis and flushed
>> to Riak on a regular schedule. Because of the lack of consistency
>> (especially around MapReduce operations), Riak isn't the best choice if you
>> require counters/aggregations to be stored in the database.
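
The counter pattern above can be sketched roughly as follows. This is an
illustration only: a ConcurrentDictionary stands in for Redis, the
BufferedCounters class and its Increment and Flush methods are hypothetical
names, and the store callback is where a real application would write the
totals into Riak (for example into a DocumentCount bucket).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// In-memory stand-in for the external counter store; all names are hypothetical.
public class BufferedCounters
{
    private readonly ConcurrentDictionary<string, long> _counts =
        new ConcurrentDictionary<string, long>();

    // Adjust the counter for a bucket and return the running total.
    public long Increment(string bucket, long delta = 1)
    {
        return _counts.AddOrUpdate(bucket, delta, (_, current) => current + delta);
    }

    // Hand the current totals to 'store'. Intended to be called on a schedule.
    public void Flush(Action<string, long> store)
    {
        foreach (KeyValuePair<string, long> entry in _counts)
        {
            store(entry.Key, entry.Value);
        }
    }
}
```

The application would call Increment whenever it stores a document and Flush
from a timer, so reading the count never requires listing keys.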
>>
>> Once CRDTs [2] make it into mainstream Riak, you can make use of those
>> data structures to implement distributed counters in Riak.
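
To give a feel for what a CRDT counter looks like, here is a minimal sketch of
a grow-only counter (G-Counter), the textbook example. It says nothing about
how Riak will implement or expose CRDTs. The GCounter class and its members
are names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Grow-only counter: one slot per node, merged by taking the per-node maximum.
public class GCounter
{
    private readonly Dictionary<string, long> _slots = new Dictionary<string, long>();

    // Each node only ever increases its own slot.
    public void Increment(string nodeId, long amount = 1)
    {
        if (amount < 0) throw new ArgumentOutOfRangeException("amount");
        long current;
        _slots.TryGetValue(nodeId, out current);
        _slots[nodeId] = current + amount;
    }

    // The counter's value is the sum of all node slots.
    public long Value
    {
        get { return _slots.Values.Sum(); }
    }

    // Merging is commutative, associative, and idempotent, so replicas
    // converge regardless of the order in which they exchange state.
    public void Merge(GCounter other)
    {
        if (ReferenceEquals(this, other)) return;
        foreach (KeyValuePair<string, long> slot in other._slots)
        {
            long mine;
            _slots.TryGetValue(slot.Key, out mine);
            _slots[slot.Key] = Math.Max(mine, slot.Value);
        }
    }
}
```

Two replicas that each count locally and later merge in either order end up
with the same total, which is the property that makes this usable without
transactions.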
>>
>> [1]: http://riak.markmail.org
>> [2]: http://vimeo.com/52414903
>>
>>
>> On Mon, Feb 11, 2013 at 10:30 AM, <rkevinburton at charter.net> wrote:
>>
>> Say I need to determine how many documents there are in my database. For a
>> CorrugatedIron application I can do ListKeys and get the warning that it is
>> an expensive operation or I can do a MapReduce query. Which is the least
>> expensive? Is there an option that I am missing?
>>