indices with large number of objects on same key

Jeremiah Peschka jeremiah.peschka at
Mon Mar 5 20:50:41 EST 2012

On Mar 5, 2012, at 1:17 AM, Telmo Menezes wrote:

> So I have the situation where Riak objects contain a list of keys
> referencing other objects. There are cases where objects contain a
> very high number of such references. At some point, it becomes
> unreasonable to just store this entire list in the object.
> Is there some way of dealing with this problem in Riak that I'm missing?
> Secondary indices seem like a possible solution: I could just tag the
> referenced objects with the id of the referrer. The problem is that the
> documentation is very unclear about what happens when a large number
> of values have the same tag. If I query for this tag, is there a
> reasonable way to get a high number of results? Any sort of pagination
> or streaming?

Riak doesn't support pagination, but it does support streaming. You can run a streaming MR job and consume the results of the last phase of the MapReduce job as they are produced. I don't know which clients support this (apart from the C# client), although I suspect most do.

From [1]:

Q: Although streams weren't mentioned, do you have any recommendations on when to use streaming map/reduce versus normal map/reduce?

Streaming MapReduce sends results back as they are produced by the last phase, in multipart/mixed format. To invoke this, add ?chunked=true to the URL when you submit the job. Streaming might be appropriate when you expect the result set to be very large and have built your application so that incomplete results are useful to it. For example, in an AJAX web application, it might make sense to send some results to the browser before the entire query is complete.
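
To make that concrete, here is a rough sketch of building such a request against the HTTP interface, using a secondary index as the job input. Treat it as an illustration under assumptions rather than a tested recipe: the host and port, the bucket name "items", the index name "referrer_bin" and the key "parent-42" are all made up, and the index-style "inputs" and the built-in reduce_identity function assume a Riak version with secondary indexes enabled. The only part taken directly from the text above is ?chunked=true on the URL.

```python
import json
from urllib.request import Request


def build_streaming_mapreduce_request(host, bucket, index, referrer_id):
    """Build (but do not send) a streaming MapReduce request.

    The job uses a secondary-index lookup as its input and a single
    reduce phase that passes the matching bucket/key pairs through.
    All names passed in here are hypothetical examples.
    """
    job = {
        # Assumed 2i input form: every object tagged with referrer_id.
        "inputs": {"bucket": bucket, "index": index, "key": referrer_id},
        "query": [
            {
                "reduce": {
                    "language": "erlang",
                    "module": "riak_kv_mapreduce",
                    "function": "reduce_identity",
                }
            }
        ],
    }
    # ?chunked=true asks for results to be streamed as multipart/mixed.
    url = f"http://{host}/mapred?chunked=true"
    return Request(
        url,
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_streaming_mapreduce_request(
    "localhost:8098", "items", "referrer_bin", "parent-42"
)
print(req.full_url)
```

Sending it is left out on purpose. Something like urllib.request.urlopen(req) would return a response you can read in pieces, and each part of the multipart/mixed body would then need to be split out and decoded by the application.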


Jeremiah Peschka - Managing Director, Brent Ozar PLF, LLC
Microsoft SQL Server MVP

More information about the riak-users mailing list