ListKeys or MapReduce

Jeremiah Peschka jeremiah.peschka at gmail.com
Tue Feb 12 10:17:44 EST 2013


It would be queried like any other index as an MR input. I'll create an
issue and will try to get this in some time in the next few days - no
promises, though.
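
In the meantime, a minimal sketch of what such a MapReduce job body might look like when posted to Riak's HTTP `/mapred` endpoint as JSON. This is plain Python rather than CorrugatedIron, and it only builds the job; it does not talk to a cluster. Assumptions, not confirmed in this thread: the special index is spelled `$bucket` here (the thread calls it `$keys`, so adjust the name if your Riak version differs), the built-in Erlang function `riak_kv_mapreduce:reduce_count_inputs` is used for counting, and the bucket name `documents` and helper name `build_key_count_job` are made up for illustration.

```python
import json


def build_key_count_job(bucket, index="$bucket"):
    """Build a MapReduce job body that counts the keys in one bucket.

    The input is a secondary-index query instead of a full bucket listing.
    The index name is an assumption; the thread refers to it as "$keys".
    """
    return {
        # Index input: same shape as any other 2i query used as an MR input.
        "inputs": {"bucket": bucket, "index": index, "key": bucket},
        "query": [
            {
                "reduce": {
                    "language": "erlang",
                    "module": "riak_kv_mapreduce",
                    # Assumed built-in reduce that counts its inputs.
                    "function": "reduce_count_inputs",
                    "arg": {"reduce_phase_batch_size": 1000},
                }
            }
        ],
    }


if __name__ == "__main__":
    # "documents" is a hypothetical bucket name.
    print(json.dumps(build_key_count_job("documents"), indent=2))
```

The printed JSON would be sent as the body of a POST with Content-Type application/json. This still requires the LevelDB backend, since secondary indexes are not available elsewhere.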

---
Jeremiah Peschka - Founder, Brent Ozar Unlimited
MCITP: SQL Server 2008, MVP
Cloudera Certified Developer for Apache Hadoop


On Tue, Feb 12, 2013 at 7:09 AM, Kevin Burton <rkevinburton at charter.net> wrote:

> I will read the other URLs that you mentioned. Thank you.
>
> Would you mind giving a short example (preferably using CI) of the $keys
> index?
>
> *From:* Jeremiah Peschka [mailto:jeremiah.peschka at gmail.com]
> *Sent:* Tuesday, February 12, 2013 8:52 AM
> *To:* Kevin Burton
> *Cc:* riak-users
> *Subject:* Re: ListKeys or MapReduce
>
> They're both pretty crappy in terms of performance - they read all data
> off of disk. If you're using LevelDB you can use the $keys index to pull
> back just the keys in a single bucket.
>
> A better approach is to maintain a separate bucket - e.g. DocumentCount -
> that is used for counting documents. Unfortunately, you can't guarantee
> transactional consistency around counts in Riak today, so you'll want to
> move maintaining the counts out of Riak and into something else. If you
> search the list archives [1], you'll find that Redis has been mentioned as
> a good way to solve this problem - counters are stored in Redis and flushed
> to Riak on a regular schedule. Because of the lack of consistency
> (especially around MapReduce operations), Riak isn't the best choice if you
> require counters/aggregations to be stored in the database.
>
> Once CRDTs [2] make it into mainstream Riak, you can make use of those
> data structures to implement distributed counters in Riak.
>
> [1]: http://riak.markmail.org
> [2]: http://vimeo.com/52414903
>
> On Mon, Feb 11, 2013 at 10:30 AM, <rkevinburton at charter.net> wrote:
>
> Say I need to determine how many documents there are in my database. For a
> CorrugatedIron application I can do ListKeys and get the warning that it is
> an expensive operation, or I can do a MapReduce query. Which is the
> least expensive? Is there an option that I am missing?