Future roadmap for indexed queries?

Rusty Klophaus rusty at basho.com
Sat Feb 27 08:24:25 EST 2010

Hi Preston,

Map/Reduce in Riak works a bit differently than Map/Reduce in Couch. In
Riak, you can think of Map/Reduce as a mini-Hadoop job, where you pass in a
set of input keys, and then define a chain of Map and Reduce phases that
operate on that data.

The Map phase runs a map function once for each input key/object. This
happens in parallel, and the work is distributed across your cluster, so the
function actually runs on the node where your data lives. The results of
this Map function are cached in memory at an object level until the object
is changed or deleted. The Map phase can return data or another list of
keys. (As you correctly mentioned, only Map phases with "named" functions,
not anonymous functions, are cached.)
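
To make the caching behaviour concrete, here is a rough sketch in Python. It
is a toy in-memory model, not Riak's API: the names store, map_cache, stats,
run_map, put and delete are all made up for illustration, and "named" is
approximated by the Python function's __name__.

```python
from typing import Any, Callable, Dict, List, Tuple

# Toy stand-in for a bucket of objects (hypothetical, not Riak storage).
store: Dict[str, Any] = {"k1": 1, "k2": 2}

# Map results cached per object: (function name, object key) -> results.
map_cache: Dict[Tuple[str, str], List[Any]] = {}

# Counts how often the map function really runs, to show cache hits.
stats = {"calls": 0}


def double(value: Any) -> List[Any]:
    """A 'named' map function: returns a list of results for one object."""
    stats["calls"] += 1
    return [value * 2]


def run_map(fn: Callable[[Any], List[Any]], keys: List[str]) -> List[Any]:
    """Run fn once per input key, reusing cached results where present."""
    results: List[Any] = []
    for key in keys:
        cache_key = (fn.__name__, key)
        if cache_key not in map_cache:
            map_cache[cache_key] = fn(store[key])
        results.extend(map_cache[cache_key])
    return results


def invalidate(key: str) -> None:
    """Drop every cached map result that was computed from this object."""
    for cache_key in [ck for ck in map_cache if ck[1] == key]:
        del map_cache[cache_key]


def put(key: str, value: Any) -> None:
    """Changing an object invalidates its cached map results."""
    store[key] = value
    invalidate(key)


def delete(key: str) -> None:
    """Deleting an object invalidates its cached map results too."""
    store.pop(key, None)
    invalidate(key)
```

Running run_map(double, ["k1", "k2"]) twice only calls double twice in total;
after put("k1", 10), the next run recomputes k1 and reuses the cached k2.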

The Reduce phase gathers the output from a Map phase and can either
aggregate data in some way, or produce a new list of keys.
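
As a rough illustration of how such a chain of phases fits together, here is
a single-process Python sketch. It is not Riak's API and ignores distribution
entirely; store, run_job, count_words and merge_counts are hypothetical names
chosen for the example.

```python
from collections import Counter
from typing import Any, Callable, Dict, List, Tuple

# Toy stand-in for stored objects (hypothetical, not Riak storage).
store: Dict[str, str] = {
    "doc1": "riak couch riak",
    "doc2": "couch hadoop",
}

Phase = Tuple[str, Callable[..., List[Any]]]


def count_words(value: str) -> List[Dict[str, int]]:
    """Map function: called once per object, returns a list of results."""
    return [dict(Counter(value.split()))]


def merge_counts(values: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Reduce function: aggregates the whole output of the previous phase."""
    total: Counter = Counter()
    for counts in values:
        total.update(counts)
    return [dict(total)]


def run_job(input_keys: List[str], phases: List[Phase]) -> List[Any]:
    """Feed the input keys through each phase in order."""
    data: List[Any] = list(input_keys)
    for kind, fn in phases:
        if kind == "map":
            # A map phase treats its inputs as keys and runs once per object.
            out: List[Any] = []
            for key in data:
                out.extend(fn(store[key]))
            data = out
        elif kind == "reduce":
            # A reduce phase sees all of the previous phase's output at once.
            data = fn(data)
        else:
            raise ValueError(f"unknown phase type: {kind}")
    return data


result = run_job(["doc1", "doc2"], [("map", count_words), ("reduce", merge_counts)])
```

Because a map phase may also return keys, a map phase whose function returns
a list of keys could be followed by a second map phase over those keys.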

So to get back to your email, data is cached in Map/Reduce not when you have
added new data, but rather after you query on that data. At that point, new
queries that would touch the same data can use the cached results instead.

I sent another email out on this thread a few minutes ago that talks about
pre-commit hooks. These would allow you to "pre-cache" query results when
you add new data.
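
The general idea of pre-caching on write can be sketched as follows. This is
only a toy Python model under my own assumptions, not how Riak hooks are
defined or registered; pre_commit_hooks, precache_hook, precomputed and put
are hypothetical names.

```python
from typing import Callable, Dict, List

# Toy stand-ins (hypothetical): stored objects and precomputed query results.
store: Dict[str, str] = {}
precomputed: Dict[str, int] = {}


def word_count(value: str) -> int:
    """The query we want answered without touching the object at read time."""
    return len(value.split())


def precache_hook(key: str, value: str) -> None:
    """Runs before the write is committed and stores the query result."""
    precomputed[key] = word_count(value)


# Hooks that run, in order, before each write is committed.
pre_commit_hooks: List[Callable[[str, str], None]] = [precache_hook]


def put(key: str, value: str) -> None:
    """Run every pre-commit hook, then commit the write."""
    for hook in pre_commit_hooks:
        hook(key, value)
    store[key] = value


put("doc1", "riak couch riak")
```

After the put, precomputed["doc1"] already holds the result, so a later query
does not have to compute it first.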


On Fri, Feb 26, 2010 at 4:57 PM, Preston Marshall <preston at synergyeoc.com> wrote:

> John, Riak allows you to cache mapreduce queries if you create them as
> named queries.  In a way, this kind of creates an index, because I think it
> works a bit like CouchDB where the caches are added to when there is new
> data.  I think Riak also has a similar query paradigm to CouchDB, which is
> dynamic data, static queries.  I may be completely wrong here, so feel free
> to correct me.
> Thanks,
> Preston
> On Feb 26, 2010, at 3:53 PM, John Lynch wrote:
> I am preparing to give a talk on Riak next week to a local Ruby user group
> here in San Diego, and wanted to get your thoughts on the future of Riak.
> While in its current form it is awesome for loads of use cases, it falls
> short in the whole querying department, at least as it relates to building
> typical web applications. I get the map/reduce features, and the link
> features, but are there any plans to build in indexed query capabilities, a
> la MongoDB?  Mongo obviously has an easier time of this since it knows
> exactly what format your data is in (BSON). Or, is this something you would
> leave to third-parties to build as part of an ORM framework like Ripple, for
> example, which would have a better idea of the shape of the data and could
> create/maintain indexes accordingly...
> Regards,
> John Lynch, CTO
> Rigel Group, LLC
> john at rigelgroupllc.com
> _______________________________________________
> riak-users mailing list
> riak-users at lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com