The suitability of MapReduce

Matt Black matt.black at jbadigital.com
Mon Apr 8 19:26:34 EDT 2013


I think a short, explicit discussion of using sequential GETs would be
good to add to the docs at [1]. It would help to put the alternative
in the reader's head so they can weigh it against MapReduce as they go
through the article.
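
To make the alternative concrete, here is a rough Python sketch of what
"sequential GETs" could look like when you already know the bucket-key
pairs. Treat it as an illustration under assumptions, not as what the docs
should say verbatim: the function names (http_fetch, sequential_get) are
made up for this example, and it assumes the stored values are JSON, that
the node answers HTTP on 127.0.0.1:8098, and that there are no siblings to
resolve. The filtering happens on the client rather than in a map phase.

```python
import json
from typing import Callable, Iterable, Iterator, Optional, Tuple
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

BucketKey = Tuple[str, str]


def http_fetch(bucket: str, key: str,
               base_url: str = "http://127.0.0.1:8098") -> Optional[bytes]:
    """Fetch one object over HTTP; return None when the key is missing.

    Assumption: the node exposes objects at /buckets/<bucket>/keys/<key>
    on the given host and port. Adjust for your own cluster.
    """
    url = f"{base_url}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    try:
        with urlopen(url) as resp:
            return resp.read()
    except HTTPError as err:
        if err.code == 404:
            return None
        raise  # anything else (including sibling responses) is not handled here


def sequential_get(
    pairs: Iterable[BucketKey],
    fetch: Callable[[str, str], Optional[bytes]] = http_fetch,
) -> Iterator[Tuple[BucketKey, dict]]:
    """Issue one GET per known bucket-key pair, skipping missing keys."""
    for bucket, key in pairs:
        body = fetch(bucket, key)
        if body is not None:
            # Assumption: values are JSON documents.
            yield (bucket, key), json.loads(body)


if __name__ == "__main__":
    # In-memory stand-in for a cluster so the sketch runs without Riak.
    store = {
        ("orders", "a"): b'{"version": 1}',
        ("orders", "b"): b'{"version": 2}',
    }
    known = [("orders", "a"), ("orders", "b"), ("orders", "missing")]
    v1_orders = [
        bk
        for bk, obj in sequential_get(known, fetch=lambda b, k: store.get((b, k)))
        if obj.get("version") == 1
    ]
    print(v1_orders)
```

The fetch function is passed in so the loop can be tried against a stub;
swapping in a real client library call should only mean replacing that one
argument.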

Cheers
Matt


On 9 April 2013 02:02, Jeremiah Peschka <jeremiah.peschka at gmail.com> wrote:

> I want to follow up on the recent "Map phase timeout" thread [2]. In part
> out of curiosity and in part as a documentation clean up... Should the
> documentation at [1] be changed? Specifically, the docs say MR should be
> used:
>
>    - *When you know the set of objects you want to MapReduce over (the
>    bucket-key pairs)* (emphasis added)
>    - When you want to return actual objects or pieces of the object – not
>    just the keys, as do Search & Secondary Indexes
>    - When you need utmost flexibility in querying your data. MapReduce
>    gives you full access to your object and lets you pick it apart any way you
>    want.
>
> It seems to me that a lot of discussions around MR in Riak come down to
> "You're close but this isn't the best use case of MapReduce in Riak." Would
> it be better, for the purposes of a general discussion, to say that
> MapReduce is the appropriate paradigm when you want to:
>
>    - manipulate a large amount of data inside the Riak cluster in bulk,
>    e.g. read all of my sales orders and, where the version is 1, perform the
>    changes necessary to update the order format to version 2
>    - burn a lot of I/O and make your admin sad
>    - move data from one bucket to another
>    - rewrite an entire bucket so all data is indexed for 2i, search, etc.
>    - do anything where the query can be resumed with no knowledge of state
>    at the time the last run of the query failed
>
> Are there other use cases when MR is the better approach?
>
> [1]:
> http://docs.basho.com/riak/latest/tutorials/querying/MapReduce/#When-to-Use-MapReduce
> [2]:
> http://riak.markmail.org/search/?q=#query:+page:1+mid:4o27v64qf55ejzwc+state:results
>
>  ---
> Jeremiah Peschka - Founder, Brent Ozar Unlimited
> MCITP: SQL Server 2008, MVP
> Cloudera Certified Developer for Apache Hadoop
>
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>